Transcript and Presenter's Notes

Title: Nonnegative least squares for imaging data


1
Nonnegative least squares for imaging data
  • Keith Worsley
  • McGill
  • Jonathan Taylor
  • Stanford and Université de Montréal
  • John Aston
  • Academia Sinica, Taipei

2
(No Transcript)
3
Nature (2005)
4
Subject is shown one of 40 faces chosen at random
Happy   Sad   Fearful   Neutral
5
but the face is only revealed through random bubbles
  • First trial: Sad expression
  • Subject is asked the expression: Neutral
  • Response: Incorrect

75 random bubble centres
Smoothed by a Gaussian bubble
What the subject sees
Sad
6
Your turn
  • Trial 2

Subject response: Fearful - CORRECT
7
Your turn
  • Trial 3

Subject response: Happy - INCORRECT (Fearful)
8
Your turn
  • Trial 4

Subject response: Happy - CORRECT
9
Your turn
  • Trial 5

Subject response: Fearful - CORRECT
10
Your turn
  • Trial 6

Subject response: Sad - CORRECT
11
Your turn
  • Trial 7

Subject response: Happy - CORRECT
12
Your turn
  • Trial 8

Subject response: Neutral - CORRECT
13
Your turn
  • Trial 9

Subject response: Happy - CORRECT
14
Your turn
  • Trial 3000

Subject response: Happy - INCORRECT (Fearful)
15
Bubbles analysis
  • E.g. Fearful (3000/4750 trials)

Trial: 1 2 3 4 5 6 7 … 750
Sum
Correct trials
Thresholded at the proportion of correct trials (0.68), scaled to [0, 1]
Use this as a bubble mask
Proportion of correct bubbles = (sum of correct bubbles) / (sum of all bubbles)
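A minimal sketch of this step, assuming the smoothed bubble images are stored as a trials-by-pixels array and the responses as a 0/1 vector (array names hypothetical):

```python
import numpy as np

def bubbles_analysis(bubbles, correct, thresh=0.68):
    """Proportion-of-correct-bubbles image and bubble mask, as on this slide.

    bubbles : (n_trials, n_pixels) smoothed bubble images
    correct : (n_trials,) 0/1 responses, 1 = correct
    """
    bubbles = np.asarray(bubbles, dtype=float)
    correct = np.asarray(correct, dtype=float)
    sum_all = bubbles.sum(axis=0)                            # sum over all trials
    sum_correct = (bubbles * correct[:, None]).sum(axis=0)   # sum over correct trials
    # proportion of correct bubbles = (sum of correct bubbles) / (sum of all bubbles)
    prop = np.divide(sum_correct, sum_all,
                     out=np.zeros_like(sum_all), where=sum_all > 0)
    # bubble mask: threshold at the proportion of correct trials (0.68 here)
    mask = (prop > thresh).astype(float)
    return prop, mask
```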
16
Results
  • Mask × average face
  • But are these features real or just noise?
  • Need statistics

Happy   Sad   Fearful   Neutral
17
Statistical analysis
  • Correlate bubbles with the response (correct = 1, incorrect = 0), separately for each expression
  • Equivalent to a 2-sample Z-statistic for correct vs. incorrect bubbles, e.g. Fearful
  • Very similar to the proportion of correct bubbles

Z ~ N(0,1) statistic
Trial:    1 2 3 4 5 6 7 … 750
Response: 0 1 1 0 1 1 1 … 1
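A minimal sketch of this two-sample Z statistic, computed pixel by pixel under the same hypothetical array layout as above:

```python
import numpy as np

def two_sample_z(bubbles, correct):
    """Two-sample Z statistic per pixel: correct vs. incorrect trials."""
    bubbles = np.asarray(bubbles, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    a, b = bubbles[correct], bubbles[~correct]    # correct / incorrect trials
    n1, n2 = len(a), len(b)
    # pooled variance across the two groups, computed pixel by pixel
    s2 = (a.var(axis=0, ddof=1) * (n1 - 1) +
          b.var(axis=0, ddof=1) * (n2 - 1)) / (n1 + n2 - 2)
    z = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(s2 * (1 / n1 + 1 / n2))
    return z    # approximately N(0, 1) at each pixel under the null
```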
18
Results
  • Thresholded at Z = 1.64 (P = 0.05)
  • Multiple comparisons correction for 91,200 pixels?
  • Need random field theory

Z ~ N(0,1) statistic
Average face   Happy   Sad   Fearful   Neutral
19
Euler Characteristic = #blobs - #holes
Excursion set {Z > threshold} for the neutral face
EC = 0, 0, -7, -11, 13, 14, 9, 1, 0
Heuristic: at high thresholds t the holes disappear, EC = 1 or 0, and E(EC) ≈ P(max Z > t).
  • Exact expression for E(EC) for all thresholds
  • E(EC) ≈ P(max Z > t) is extremely accurate.

20
Random field theory
Lipschitz-Killing curvatures of S (= Resels(S) × c)
EC densities of Z above t
[Diagram: white noise convolved with a filter of width FWHM gives the smooth random field Z(s)]
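A minimal sketch of how E(EC), and hence a corrected threshold, could be computed for a Gaussian field over a 2D rectangular search region, using the standard EC densities expressed in resels; the 240 × 380 pixel size comes from the later bubbles slides, but the FWHM value here is an assumption:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def expected_ec_2d(t, resels0, resels1, resels2):
    """E(EC) of the excursion set of a smooth Gaussian field above t,
    using the standard 2D EC densities in resel units."""
    rho0 = norm.sf(t)
    rho1 = np.sqrt(4 * np.log(2)) / (2 * np.pi) * np.exp(-t**2 / 2)
    rho2 = (4 * np.log(2)) / (2 * np.pi)**1.5 * t * np.exp(-t**2 / 2)
    return resels0 * rho0 + resels1 * rho1 + resels2 * rho2

fwhm = 20.0                                 # assumed smoothing, in pixels
resels0 = 1.0                               # EC of the (rectangular) search region
resels1 = (240 + 380) / fwhm                # half perimeter / FWHM
resels2 = 240 * 380 / fwhm**2               # area / FWHM^2

# P(max Z > t) is approximately E(EC); solve E(EC) = 0.05 for the threshold
t_corrected = brentq(
    lambda t: expected_ec_2d(t, resels0, resels1, resels2) - 0.05, 2.0, 6.0)
print(round(t_corrected, 2))
```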
21
Results, corrected for search
  • Random field theory threshold: Z = 3.92 (P = 0.05)
  • 3.82, 3.80, 3.81, 3.80
  • Saddle-point approximation (Rabinowitz, 1997; Chamandy, 2007)?
  • Bonferroni: Z = 4.87 (P = 0.05) - nothing

Z ~ N(0,1) statistic
Average face   Happy   Sad   Fearful   Neutral
22
fMRI data: 120 scans, 3 scans each of hot, rest, warm, rest, …
T = (hot - warm effect) / S.d. ~ t110 if no effect
23
Linear model regressors
24
Linear model for fMRI time series with AR(p) errors
Y(t) = x(t)'β + ε(t),   ε(t) = a1 ε(t-1) + … + ap ε(t-p) + σ ξ(t),   ξ(t) ~ N(0, 1) independently
Unknown parameters: β, a1, …, ap, σ
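A minimal sketch of fitting this kind of model by the usual two-stage prewhitening, shown for AR(1) only (a generic estimator, not necessarily the exact one used in the slides):

```python
import numpy as np

def fit_ar1_linear_model(Y, X):
    """Fit Y = X @ beta + eps with AR(1) errors by prewhitening.

    Y : (n,) fMRI time series at one voxel
    X : (n, k) design matrix of regressors
    """
    # Stage 1: ordinary least squares
    beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta_ols
    # lag-1 autocorrelation of the residuals estimates a1
    a1 = np.dot(resid[1:], resid[:-1]) / np.dot(resid, resid)
    # Stage 2: prewhiten both sides and refit
    Yw = Y[1:] - a1 * Y[:-1]
    Xw = X[1:] - a1 * X[:-1]
    beta, *_ = np.linalg.lstsq(Xw, Yw, rcond=None)
    sigma2 = np.sum((Yw - Xw @ beta) ** 2) / (len(Yw) - X.shape[1])
    return beta, a1, sigma2
```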

25
Unknown latency d of HRF
Model the response as β x(t; d), where the HRF is shifted by an unknown latency d.
Unknown parameters: β, d
26
Example: convolve with the stimulus
[Two plots against time t (seconds): the stimulus, and the responses obtained by convolving it with the HRF regressors]
  • Two interesting problems:
  • Estimate the latency shift d and its standard error
  • Test for the magnitude β of the stimulus, allowing for unknown latency
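A minimal sketch of how the two regressors could be built: convolve the stimulus with a reference HRF and with (minus) its time derivative, so that a small latency shift d gives x(t; d) ≈ x1(t) + d x2(t). The gamma-shaped HRF and the timings below are generic assumptions, not taken from the slides:

```python
import numpy as np

dt = 1.0                                    # seconds per scan (assumed)
t = np.arange(0, 30, dt)
hrf = (t / 5.4) ** 6 * np.exp(-(t - 5.4) / 0.9)     # generic gamma HRF, peak ~5.4 s
hrf /= hrf.sum()
dhrf = -np.gradient(hrf, dt)                # h(t - d) ≈ h(t) + d * (-h'(t))

n = 120                                     # number of scans (120, as in the earlier fMRI example)
stimulus = (np.arange(n) % 40 < 20).astype(float)   # assumed 20 s on / 20 s off

x1 = np.convolve(stimulus, hrf)[:n]         # standard response
x2 = np.convolve(stimulus, dhrf)[:n]        # latency-shift regressor
X = np.column_stack([x1, x2])               # design matrix for (beta1, beta2)
```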

27
Test for the magnitude β of the stimulus, allowing for unknown latency
  • We could do either:
  • a T-test on β1 > 0, allowing for β2 - but this loses sensitivity if d is far from 0
  • an F-test on (β1, β2) ≠ 0 - but this wastes sensitivity on unrealistic HRFs

28
Cone alternative
  • We know that:
  • the magnitude of the response is positive
  • the latency shift must lie in some restricted interval, say [-2, 2] seconds
  • This implies that:
  • β ≥ 0
  • -Δ ≤ d ≤ Δ, where Δ = 2 seconds
  • This specifies a cone alternative for (β1 = β, β2 = dβ) (Friman et al., 2003)

[Diagram: cone alternative in the (β1, β2) plane, bounded by the rays d = -2 and d = 2, with d = 0 along the β1 axis and the null at 0; cone angle θ = 2 atan(Δ‖x2‖ / ‖x1‖)]
29
Non-negative least squares
[Plot: the regressors x1(t) and x2(t) and the stimulus against time t (seconds)]
30
Non-negative least squares
[Diagram: the non-negativity constraints β1 ≥ 0, β2 ≥ 0 define a cone in the (β1, β2) plane whose angle θ is the angle between x1(t) and x2(t); null at 0]
31
Example of three extremes: x1(t) standard, x2(t) delayed 4 seconds, x3(t) spread 4 seconds
[Diagram: the corresponding 3D cone in (β1, β2, β3)]
32
General non-negative least squares problem: minimize ‖Y - Xβ‖² subject to βj ≥ 0 for all j.
Footnote: Woolrich et al. (2004) replace the hard constraints by soft constraints through a prior distribution on β, taking a Bayesian approach.
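A minimal sketch of this problem with SciPy's NNLS solver (which implements the Lawson & Hanson algorithm mentioned on a later slide); the data here are simulated placeholders:

```python
import numpy as np
from scipy.optimize import nnls

# minimize ||Y - X @ beta||^2  subject to  beta >= 0
rng = np.random.default_rng(0)
n, p = 20, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, 0.0, 1.5, 0.0])    # sparse and non-negative
Y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat, rnorm = nnls(X, Y)     # rnorm = ||Y - X @ beta_hat||
print(beta_hat)                  # inactive regressors come back at (or near) zero
```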
33
(No Transcript)
34
(No Transcript)
35
Pick a range of, say, p = 150 plausible values of the non-linear parameter θ: θ1, …, θp
36
Fitting the NNLS model
  • Simple:
  • do all-subsets regression,
  • throw out any model that does not satisfy the non-negativity constraints,
  • among those left, pick the model with the smallest error sum of squares (a brute-force sketch follows below).
  • For larger models there are more efficient methods, e.g. Lawson & Hanson (1974).
  • The non-negativity constraints tend to enforce sparsity, even if the regressors are highly correlated (e.g. PET).
  • Why? Highly correlated regressors have huge positive and negative unconstrained coefficients; non-negativity suppresses the negative ones.
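A brute-force sketch of the all-subsets recipe above (only workable for small p; for p = 150 one would use the Lawson & Hanson algorithm instead):

```python
import numpy as np
from itertools import combinations

def all_subsets_nnls(X, Y, max_size=None):
    """Try every subset of regressors, keep only fits with all coefficients
    non-negative, and return the one with the smallest error sum of squares."""
    n, p = X.shape
    max_size = p if max_size is None else max_size
    best_sse, best_beta = np.sum(Y ** 2), np.zeros(p)   # empty model: all beta = 0
    for k in range(1, max_size + 1):
        for subset in combinations(range(p), k):
            idx = list(subset)
            b, *_ = np.linalg.lstsq(X[:, idx], Y, rcond=None)
            if np.any(b < 0):                 # violates beta >= 0: discard
                continue
            sse = np.sum((Y - X[:, idx] @ b) ** 2)
            if sse < best_sse:
                best_beta = np.zeros(p)
                best_beta[idx] = b
                best_sse = sse
    return best_beta, best_sse
```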

37
Example: n = 20, p = 150, but surprisingly it does not overfit
[Plot: observed tracer Y, the NNLS fit Yhat, and its two non-zero components (Yhat component 1, Yhat component 2) against time, 0 to 6000]
Tend to get sparse pairs of adjacent regressors, suggesting the best regressor is somewhere in between.
38
P-values?
39
P-values for PET data at a single voxel
p = 2; cone weights w1 = ½, w2 = θ/(2π)
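This is not the authors' exact Beta-bar tail, but the classic chi-bar-squared mixture for a two-dimensional cone uses exactly these weights, so here is a minimal single-voxel sketch of that (asymptotic) stand-in, with t on the square-root (Z-like) scale:

```python
import numpy as np
from scipy.stats import chi2

def cone_pvalue_single_voxel(t, theta):
    """Upper tail of the chi-bar statistic for a 2D cone of angle theta:
    a mixture of chi-square tails with weights w1 = 1/2, w2 = theta/(2*pi)
    (the weight on the zero component contributes nothing for t > 0)."""
    w1, w2 = 0.5, theta / (2 * np.pi)
    return w1 * chi2.sf(t**2, df=1) + w2 * chi2.sf(t**2, df=2)

# Example with a hypothetical cone angle and threshold
print(cone_pvalue_single_voxel(t=3.0, theta=np.deg2rad(38.1)))
```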
40
The Beta-bar random field
From well-known EC densities of F field
From simulations at a single voxel
Same linear combination!
41
Proof
[Diagram: independent fields Z1 ~ N(0,1) and Z2 ~ N(0,1) over a search region S with coordinates (s1, s2); rejection regions in the (Z1, Z2) plane (cone alternative vs. null) and the corresponding excursion sets at threshold t]
42
Euler characteristic heuristic again
Excursion sets Xt inside the search region S
EC = #blobs - #holes = 1, 7, 6, 5, 2, 1, 1, 0 as the threshold increases
[Plot: observed and expected Euler characteristic EC against the threshold t]
The expression for the expected EC is EXACT!
43
Proof
44
Steiner-Weyl Tube Formula (1930)
Morse Theory Approach (1995)
  • Put a tube of radius r about the search region S
[Diagram: Tube(S, r), a tube of radius r around the search region S]
For a Gaussian random field
For a chi-bar random field???
  • Find the volume, expand it as a power series in r, and pull off the coefficients
45
[Diagram: Tube(S, r), a tube of radius r around the search region S]
Steiner-Weyl Volume of Tubes Formula (1930)
Lipschitz-Killing curvatures are just intrinsic volumes or Minkowski functionals in the (Riemannian) metric of the variance of the derivative of the process
46
[Diagram: the search region S approximated by triangles of a given edge length]
Lipschitz-Killing curvature of the triangles
Lipschitz-Killing curvature of the union of the triangles
47
Non-isotropic data? Use the Riemannian metric of Var(∇Z)
[Diagram: Z ~ N(0,1) over the search region with coordinates (s1, s2), triangulated with a given edge length]
Lipschitz-Killing curvature of the triangles
Lipschitz-Killing curvature of the union of the triangles
48
We need independent, identically distributed random fields, e.g. residuals from a linear model
Lipschitz-Killing curvature of the triangles
Lipschitz-Killing curvature of the union of the triangles
Taylor & Worsley, JASA (2007)
49
Beautiful symmetry
Steiner-Weyl Tube Formula (1930)
Taylor's Gaussian Tube Formula (2003)
  • Put a tube of radius r about the search region S and the rejection region Rt
[Diagram: in the (Z1, Z2) plane, with Z1 ~ N(0,1) and Z2 ~ N(0,1), the rejection region Rt at threshold t and its tube Tube(Rt, r) reaching down to t - r; and Tube(S, r), a tube of radius r around the search region S]
  • Find the volume or probability, expand it as a power series in r, and pull off the coefficients

50
[Diagram: the rejection region Rt in the (Z1, Z2) plane, Z1, Z2 ~ N(0,1), with its tube Tube(Rt, r) of radius r reaching from the threshold t down to t - r]
Taylor's Gaussian Tube Formula (2003)
51

From well-known EC densities of F field
From simulations at a single voxel
Same linear combination!
52
Proof, n = 3
53
Power? S = 1000 cc brain, FWHM = 10 mm, P = 0.05
[Two power plots, one for an event design and one for a 20-second block design, with cone angles θ = 78.4° and θ = 38.1°: power of the T-test on β1, of the Beta-bar test (cone weights w1 = ½, w2 = θ/2π), and of the F-test on (β1, β2), plotted against the shift d of the HRF (0 to 3 seconds); insets show the responses x1(t) and x2(t) against time t (seconds)]
54
Bubbles task in fMRI scanner
  • Correlate bubbles with BOLD at every voxel
  • Calculate Z for each pair (bubble pixel, fMRI voxel): a 5D image of Z statistics

Trial: 1 2 3 4 5 6 7 … 3000
fMRI
55
Thresholding? Cross-correlation random field
  • Correlation between 2 fields at 2 different locations, searched over all pairs of locations, one in S, one in T
  • Bubbles data: P = 0.05, n = 3000, c = 0.113, T = 6.22

Cao & Worsley, Annals of Applied Probability (1999)
56
NNLS for bubbles?
  • At the moment, we are correlating Y(t) = fMRI data at each voxel with each of the 240 × 380 = 91,200 face pixels as regressors x1(t), …, x91200(t), separately:
  • Y(t) = xj(t)βj + z(t)γ + e(t).
  • We should be doing this simultaneously:
  • Y(t) = x1(t)β1 + … + x91200(t)β91200 + z(t)γ + e(t).
  • Obviously impossible: observations (3000) << regressors (91,200).
  • Maybe we can use NNLS: β1 ≥ 0, …, β91200 ≥ 0.
  • It should enforce sparsity over β = activation at face pixels, provided observations (3000) >> dimensions of cone = resels of face (146.2).
  • We can threshold Beta-bar over brain voxels at P < 0.05 using the above.
  • The result will be a face image of isolated local maxima for each voxel.
  • It will tell you which brain voxels are activated, but not which face pixels.
  • Might be a huge computational task!
  • Interactions? Y = x1 + … + x91200 + x1x2 + … + z.

57
MS lesions and cortical thickness
  • Idea: MS lesions interrupt neuronal signals, causing thinning in downstream cortex
  • Data: n = 425 mild MS patients

[Scatter plot: average cortical thickness (mm) against total lesion volume (cc), n = 425; correlation = -0.568, T = -14.20 (423 df); Charil et al., NeuroImage (2007)]
58
MS lesions and cortical thickness at all pairs of points
  • Dominated by total lesions and average cortical thickness, so remove these effects as follows:
  • CT = cortical thickness, smoothed 20 mm
  • ACT = average cortical thickness
  • LD = lesion density, smoothed 10 mm
  • TLV = total lesion volume
  • Find the partial correlation (LD, CT - ACT) removing TLV via the linear model
  • CT - ACT = 1 + TLV + LD
  • and test for LD (a sketch follows below)
  • Repeat for all voxels in 3D, nodes in 2D
  • 1 billion correlations, so thresholding is essential!
  • Look for high negative correlations
  • Threshold: P = 0.05, c = 0.300, T = 6.48
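A minimal per-location sketch of this partial correlation, assuming subject-level vectors for LD, CT - ACT, and TLV at one (voxel, node) pair (variable names hypothetical):

```python
import numpy as np

def partial_correlation(ld, ct_minus_act, tlv):
    """Correlate LD with CT - ACT after removing TLV via a linear model
    with an intercept; returns the partial correlation and its T statistic."""
    ld, ct_minus_act, tlv = (np.asarray(v, dtype=float)
                             for v in (ld, ct_minus_act, tlv))
    X = np.column_stack([np.ones_like(tlv), tlv])       # model: 1 + TLV

    def resid(y):                                        # residualize on 1 + TLV
        return y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

    r_ld, r_ct = resid(ld), resid(ct_minus_act)
    r = np.dot(r_ld, r_ct) / np.sqrt(np.dot(r_ld, r_ld) * np.dot(r_ct, r_ct))
    df = len(tlv) - 3                                    # residual df of 1 + TLV + LD
    t = r * np.sqrt(df / (1 - r**2))
    return r, t, df
```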

59
Cluster extent rather than peak height (Friston, 1994)
  • Choose a lower threshold, e.g. t = 3.11 (P = 0.001)
  • Find clusters, i.e. connected components of the excursion set (a sketch follows below)
  • Measure cluster extent by resels
  • Distribution: fit a quadratic to the peak
  • Distribution of maximum cluster extent: Bonferroni on N = number of clusters ≈ E(EC).
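A minimal sketch of the cluster-finding step on a Z image, using connected-component labelling; the threshold, FWHM, and voxel size here are placeholders:

```python
import numpy as np
from scipy.ndimage import label

def cluster_extents(z, t=3.11, fwhm=10.0, voxel_size=1.0):
    """Threshold the Z image at t, find connected components of the
    excursion set, and return each cluster's extent in resels."""
    excursion = z > t                          # excursion set
    labels, n_clusters = label(excursion)      # connected components = clusters
    sizes = np.bincount(labels.ravel())[1:]    # voxels per cluster (drop background)
    resels_per_voxel = (voxel_size / fwhm) ** z.ndim
    return sizes * resels_per_voxel, n_clusters
```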

[Diagram: a cluster of the field Z(s) above the threshold t (D = 1), showing the cluster extent along s and the peak height]
Cao and Worsley, Advances in Applied Probability
(1999)