Title: Hidden Process Models with applications to fMRI data
1 Hidden Process Models with applications to fMRI data
- Rebecca Hutchinson
- Oregon State University
- Joint work with Tom M. Mitchell
- Carnegie Mellon University
- August 2, 2009
- Joint Statistical Meetings, Washington DC
2 Introduction
- Hidden Process Models (HPMs)
- A probabilistic model for time series data.
- Designed for data generated by a collection of latent processes.
- Example domain
- Modeling cognitive processes (e.g. making a decision) in functional Magnetic Resonance Imaging time series.
- Characteristics of potential domains
- Processes with spatial-temporal signatures.
- Uncertainty about temporal location of processes.
- High-dimensional, sparse, noisy.
3 fMRI Data
- Features: 5k-15k voxels, imaged every second.
- Training examples: 10-40 trials (task repetitions).
[Figure: Hemodynamic Response. Signal Amplitude vs. Time (seconds), with the Neural activity marked.]
4 Study: Pictures and Sentences
[Figure: trial timeline starting at t=0. Labels: Read Sentence, View Picture (shown in both orders), Press Button, Fixation, Rest; intervals of 4 sec. and 8 sec.]
- Task: Decide whether sentence describes picture correctly, indicate with button press.
- 13 normal subjects, 40 trials per subject.
- Sentences and pictures describe 3 symbols, using above, below, not above, not below.
- Images are acquired every 0.5 seconds.
5 Goals for fMRI
- To track cognitive processes over time.
- Estimate hemodynamic response signatures.
- Estimate process timings.
- Modeling processes that do not directly correspond to the stimuli timing is a key contribution of HPMs!
- To compare hypotheses of cognitive behavior.
6 Processes of the HPM
- Process 1: ReadSentence. Response signature W (voxels v1, v2); duration d = 11 sec.; offsets Ω = {0, 1}; offset probabilities Θ = {θ0, θ1}.
- Process 2: ViewPicture. Response signature W (voxels v1, v2); duration d = 11 sec.; offsets Ω = {0, 1}; offset probabilities Θ = {θ0, θ1}.
- Input stimulus: sentence, picture, with timing landmarks λ1, λ2.
- Process instance π2: process h = 2, timing landmark λ2, offset O = 1 (start time = λ2 + O).
- One configuration c of process instances π1, π2, ..., πk.
- Predicted mean for voxels v1, v2, plus noise N(0, σ1) and N(0, σ2).
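One way to read this slide is that the predicted mean is built by placing each process instance's response signature at its start time (timing landmark plus offset). The sketch below illustrates that idea under an assumption not stated on the slide: overlapping responses are summed linearly. The function name, the toy signatures, and the use of image indices for time are all hypothetical.

```python
def predicted_mean(instances, signatures, n_images, n_voxels):
    """Return an n_images x n_voxels mean built from process instances.

    instances: list of (process_name, landmark, offset), times in images.
    signatures: dict mapping process_name to a duration x n_voxels list.
    Assumption: overlapping responses add linearly.
    """
    mean = [[0.0] * n_voxels for _ in range(n_images)]
    for process, landmark, offset in instances:
        start = landmark + offset  # start time = timing landmark + offset
        for tau, row in enumerate(signatures[process]):
            t = start + tau
            if 0 <= t < n_images:  # drop any response outside the window
                for v in range(n_voxels):
                    mean[t][v] += row[v]
    return mean


# Toy response signatures (made-up numbers): 3 images long, 2 voxels.
signatures = {
    "ReadSentence": [[1.0, 0.0], [2.0, 1.0], [1.0, 0.5]],
    "ViewPicture": [[0.0, 1.0], [0.5, 2.0], [0.0, 1.0]],
}
# One configuration: ReadSentence at landmark 0, offset 0;
# ViewPicture at landmark 1, offset 1 (so it starts at image 2).
configuration = [("ReadSentence", 0, 0), ("ViewPicture", 1, 1)]
mean = predicted_mean(configuration, signatures, n_images=6, n_voxels=2)
for t, row in enumerate(mean):
    print(t, row)
```

At image 2 the two toy responses overlap, so the mean there is the sum of the last ReadSentence row and the first ViewPicture row. The observed data would then be this mean plus per-voxel Gaussian noise.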
7 HPM Formalism
- HPM = <H, C, F, Σ>
- H = <h1, ..., hH>, a set of processes (e.g. ReadSentence)
- h = <W, d, Ω, Θ>, a process
- W = response signature
- d = process duration
- Ω = allowable offsets
- Θ = multinomial parameters over values in Ω
- C = <c1, ..., cC>, a set of possible configurations
- c = <π1, ..., πL>, a set of process instances
- π = <h, λ, O>, a process instance (e.g. ReadSentence(S1))
- h = process ID
- λ = timing landmark (e.g. stimulus presentation of S1)
- O = offset (takes values in Ωh)
- C = a latent variable indicating the correct configuration
- Σ = <σ1, ..., σV>, standard deviation for each voxel
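8 HPMs: the graphical model
[Figure: graphical model with nodes Configuration c, Timing Landmark λ, Process Type h, Offset o, Start Time s, Σ, and data Y(t,v). Plates: process instances π1, ..., πk; t = 1, ..., T and v = 1, ..., V. Legend: observed, unobserved.]
- The set C of configurations constrains the joint distribution on {h(k), o(k)} for all k.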
9 Encoding Experiment Design
- Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3
- Constraints encoded:
- h(π1) ∈ {1,2}
- h(π2) ∈ {1,2}
- h(π1) ≠ h(π2)
- o(π1) = 0
- o(π2) = 0
- h(π3) = 3
- o(π3) ∈ {1,2}
[Figure: input stimulus and timing landmarks λ1, λ2, with four panels: Configuration 1, Configuration 2, Configuration 3, Configuration 4.]
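The constraints on this slide can be checked by brute force: enumerating every assignment that satisfies them yields exactly four configurations, matching the four panels. This is only an illustrative sketch. Each instance is reduced to a hypothetical (process ID, offset) pair, timing landmarks are left out, and the numbering of the output need not match the slide's numbering.

```python
from itertools import product

READ_SENTENCE, VIEW_PICTURE, DECIDE = 1, 2, 3


def enumerate_configurations():
    """List every (h, o) assignment to instances 1-3 allowed by the constraints."""
    stimulus_processes = (READ_SENTENCE, VIEW_PICTURE)
    configurations = []
    for h1, h2, o3 in product(stimulus_processes, stimulus_processes, (1, 2)):
        if h1 == h2:  # constraint: h(instance 1) != h(instance 2)
            continue
        # Constraints: o(instance 1) = 0, o(instance 2) = 0, h(instance 3) = 3.
        configurations.append(((h1, 0), (h2, 0), (DECIDE, o3)))
    return configurations


configurations = enumerate_configurations()
for number, config in enumerate(configurations, start=1):
    print("Configuration", number, config)
```

Two stimulus orders times two Decide offsets gives the four configurations.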
10 Inference
- Over C, the latent indicator of the correct configuration
- Choose the most likely configuration given Y = observed data, D = input stimuli, and HPM = model.
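A minimal sketch of this inference step, assuming independent Gaussian noise per voxel (as in the noise terms shown earlier) and a prior probability for each configuration. The function names and toy numbers are hypothetical; each configuration's predicted mean is taken as given.

```python
import math


def gaussian_log_likelihood(Y, mean, sigma):
    """Log-likelihood of T x V data Y under a T x V mean and per-voxel sigma."""
    total = 0.0
    for y_row, m_row in zip(Y, mean):
        for y, m, s in zip(y_row, m_row, sigma):
            total += -0.5 * math.log(2 * math.pi * s * s) - (y - m) ** 2 / (2 * s * s)
    return total


def configuration_posterior(Y, means, sigma, priors):
    """Posterior P(C = c | Y) for each candidate configuration c."""
    log_joint = [
        math.log(p) + gaussian_log_likelihood(Y, m, sigma)
        for m, p in zip(means, priors)
    ]
    top = max(log_joint)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example: two configurations, 2 images x 2 voxels (made-up numbers).
means = [
    [[1.0, 0.0], [0.0, 0.0]],  # predicted mean under configuration 0
    [[0.0, 0.0], [1.0, 0.0]],  # predicted mean under configuration 1
]
Y = [[0.9, 0.1], [0.1, -0.1]]
posterior = configuration_posterior(Y, means, sigma=[0.5, 0.5], priors=[0.5, 0.5])
best = posterior.index(max(posterior))
print(posterior, best)
```

The chosen configuration is the one with the largest posterior; in an HPM the prior for a configuration would come from the offset probabilities Θ of its process instances.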
11 Learning
- Parameters to learn
- Response signature W for each process
- Timing distribution Θ for each process
- Standard deviation σ for each voxel
- Expectation-Maximization (EM) algorithm to estimate W and Θ.
- E step: estimate the probability distribution over C.
- M step: update estimates of W (using reweighted least squares), Θ, and σ (using standard MLEs) based on the E step.
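The EM loop can be illustrated on a deliberately simplified case: one process, one voxel, and an unknown onset, so each configuration is just a choice of offset. With a single process there is no overlap, and the reweighted least squares update for W reduces to a posterior-weighted average of the aligned data. This is a sketch under those assumptions, not the full multi-process algorithm; all names and the synthetic data are hypothetical.

```python
import math
import random


def mean_for_offset(W, offset, n_images):
    """Predicted mean for one trial: W placed at `offset`, zero elsewhere."""
    mu = [0.0] * n_images
    for tau, w in enumerate(W):
        mu[offset + tau] = w
    return mu


def squared_error(y, mu):
    return sum((a - b) ** 2 for a, b in zip(y, mu))


def e_step(trials, W, theta, sigma, offsets):
    """Posterior over offsets (the configurations here) for each trial."""
    posteriors = []
    for y in trials:
        log_joint = [
            math.log(max(th, 1e-12))
            - squared_error(y, mean_for_offset(W, o, len(y))) / (2 * sigma ** 2)
            for o, th in zip(offsets, theta)
        ]
        top = max(log_joint)
        weights = [math.exp(v - top) for v in log_joint]
        total = sum(weights)
        posteriors.append([w / total for w in weights])
    return posteriors


def m_step(trials, posteriors, offsets, duration):
    """Update theta and sigma by MLE and W by posterior-weighted averaging."""
    n_trials, n_images = len(trials), len(trials[0])
    theta = [sum(p[k] for p in posteriors) / n_trials for k in range(len(offsets))]
    W = [
        sum(p[k] * y[o + tau]
            for y, p in zip(trials, posteriors)
            for k, o in enumerate(offsets)) / n_trials
        for tau in range(duration)
    ]
    weighted_sq = sum(
        p[k] * squared_error(y, mean_for_offset(W, o, n_images))
        for y, p in zip(trials, posteriors)
        for k, o in enumerate(offsets)
    )
    sigma = math.sqrt(weighted_sq / (n_trials * n_images))
    return W, theta, sigma


# Synthetic data (made up): 28 trials at offset 0, 12 trials at offset 2.
random.seed(0)
true_W, offsets, n_images = [1.0, 2.0, 1.0], [0, 2], 6
trials = []
for true_offset in [0] * 28 + [2] * 12:
    clean = mean_for_offset(true_W, true_offset, n_images)
    trials.append([x + random.gauss(0.0, 0.1) for x in clean])

W, theta, sigma = [0.5, 1.0, 0.5], [0.5, 0.5], 1.0  # rough starting guess
for _ in range(20):
    posteriors = e_step(trials, W, theta, sigma, offsets)
    W, theta, sigma = m_step(trials, posteriors, offsets, duration=len(W))
print(W, theta, sigma)
```

With overlapping processes the W update no longer separates per process, which is where a weighted least squares solve over all processes jointly would be needed.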
12 Process Response Signatures
- Standard: Each process has a matrix of parameters, one for each point in space and time for the duration of the response (e.g. 24).
- Regularized: Same as standard, but learned with penalties for deviations from temporal and/or spatial smoothness.
- Basis functions: Each process has a small number (e.g. 3) of weights for each voxel that are combined with a basis to get the response.
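To make the basis-function variant concrete, the sketch below combines a few per-voxel weights with a shared basis to produce a response signature. The basis here is a made-up toy, not the basis set used in the talk, and the function name is hypothetical.

```python
def response_from_basis(weights, basis):
    """Combine per-voxel weights with a shared basis.

    weights: V x K list (K weights per voxel).
    basis:   K x D list (K basis functions over D time points).
    Returns a D x V response signature.
    """
    n_points = len(basis[0])
    return [
        [sum(w * b[t] for w, b in zip(voxel_weights, basis)) for voxel_weights in weights]
        for t in range(n_points)
    ]


# Toy basis with K = 3 functions over D = 4 time points (made-up numbers).
basis = [
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.0, 0.0, -1.0],
]
# Two voxels, three weights each.
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
]
response = response_from_basis(weights, basis)
for t, row in enumerate(response):
    print(t, row)
```

The appeal is the parameter count: with the example sizes on this slide, each voxel needs 3 weights instead of one parameter per time point (e.g. 24).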
13 Models
- HPM-GNB: ReadSentence and ViewPicture, duration = 8 sec. (no overlap)
- an approximation of Gaussian Naïve Bayes classifier, with HPM assumptions and noise model
- HPM-2: ReadSentence and ViewPicture, duration = 12 sec. (temporal overlap)
- HPM-3: HPM-2 + Decide (offsets = 0 to 7 images following second stimulus)
- HPM-4: HPM-3 + PressButton (offsets = {-1, 0} following button press)
14 Evaluation
- Select 1000 most active voxels.
- Compute improvement in test data log-likelihood as compared with predicting the mean training trial for all test trials (a baseline).
- 5-fold cross-validation per subject; mean over 13 subjects.
Model     Standard  Regularized  Basis functions
HPM-GNB       -293         2590             2010
HPM-2        -1150         3910             3740
HPM-3        -2000         4960             4710
HPM-4        -4490         4810             4770
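The evaluation metric can be sketched as the difference between the test log-likelihood under a model's predicted means and under a baseline that predicts the mean training trial. The version below assumes Gaussian noise with the same per-voxel standard deviation for the model and the baseline, which the slides do not specify; function names and data are hypothetical.

```python
import math


def log_likelihood(trial, mean, sigma):
    """Gaussian log-likelihood of a T x V trial given a T x V mean."""
    total = 0.0
    for y_row, m_row in zip(trial, mean):
        for y, m, s in zip(y_row, m_row, sigma):
            total += -0.5 * math.log(2 * math.pi * s ** 2) - (y - m) ** 2 / (2 * s ** 2)
    return total


def mean_trial(trials):
    """Elementwise mean of a list of T x V trials (the baseline prediction)."""
    n = len(trials)
    n_images, n_voxels = len(trials[0]), len(trials[0][0])
    return [
        [sum(trial[t][v] for trial in trials) / n for v in range(n_voxels)]
        for t in range(n_images)
    ]


def improvement_over_baseline(test_trials, model_means, train_trials, sigma):
    """Sum over test trials of model log-likelihood minus baseline log-likelihood."""
    baseline = mean_trial(train_trials)
    return sum(
        log_likelihood(y, m, sigma) - log_likelihood(y, baseline, sigma)
        for y, m in zip(test_trials, model_means)
    )


# Toy data (made up): 2 images, 1 voxel.
train_trials = [[[0.0], [1.0]], [[0.0], [3.0]]]
test_trials = [[[0.0], [3.0]]]
model_means = [[[0.0], [3.0]]]  # a model that predicts the test trial exactly
gain = improvement_over_baseline(test_trials, model_means, train_trials, sigma=[1.0])
print(gain)
```

A positive value means the model explains held-out trials better than the mean training trial; a negative value, as in the Standard column of the table, means it does worse.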
15 Interpretation and Visualization
- Timing for the third (Decide) process in HPM-3
- (Values have been rounded.)
- For each subject, average response signatures for each voxel over time, plot result in each spatial location.
- Compare time courses for the same voxel.
Offset   0     1     2     3     4     5     6     7
Stand.   0.3   0.08  0.1   0.05  0.05  0.2   0.08  0.15
Reg.     0.3   0.08  0.1   0.05  0.05  0.2   0.08  0.15
Basis    0.5   0.1   0.1   0.08  0.05  0.03  0.05  0.08
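One simple way to summarize these timing distributions is the expected offset of the Decide process. The sketch below uses the rounded values from the table and renormalizes each row, since the rounded values do not sum exactly to 1. Converting to seconds assumes offsets are counted in images acquired every 0.5 seconds, as stated for this study.

```python
SECONDS_PER_IMAGE = 0.5  # images are acquired every 0.5 seconds in this study

offsets = list(range(8))
timing = {
    "Standard": [0.3, 0.08, 0.1, 0.05, 0.05, 0.2, 0.08, 0.15],
    "Regularized": [0.3, 0.08, 0.1, 0.05, 0.05, 0.2, 0.08, 0.15],
    "Basis": [0.5, 0.1, 0.1, 0.08, 0.05, 0.03, 0.05, 0.08],
}


def expected_offset(probabilities):
    """Mean offset in images, renormalizing the rounded probabilities."""
    total = sum(probabilities)
    return sum(o * p for o, p in zip(offsets, probabilities)) / total


expected = {name: expected_offset(p) for name, p in timing.items()}
for name, value in expected.items():
    print(name, round(value, 2), "images,", round(value * SECONDS_PER_IMAGE, 2), "seconds")
```

With the rounded table values this gives roughly 3.1 images for the Standard and Regularized rows and roughly 1.8 images for the Basis row, so the basis-function model places Decide earlier on average.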
16 Standard
17 Regularized
18 Basis functions
19 Time courses
[Figure panels: Standard; Regularized; Basis functions; The basis set (Hossein-Zadeh03).]
20 Related Work
- fMRI
- General Linear Model (Dale99)
- Must assume timing of process onset to estimate hemodynamic response.
- Computer models of human cognition (Just99, Anderson04)
- Predict fMRI data rather than learning parameters of processes from the data.
- Machine Learning
- Classification of windows of fMRI data (overview in Haynes06)
- Does not typically model overlapping hemodynamic responses.
- Dynamic Bayes Networks (Murphy02, Ghahramani97)
- HPM assumptions/constraints can be encoded by extending factorial HMMs with links between the Markov chains.
21 Conclusions
- Take-away messages
- HPMs are a probabilistic model for time series data generated by a collection of latent processes.
- In the fMRI domain, HPMs can simultaneously estimate the hemodynamic response and localize the timing of cognitive processes.
- Future work
- Automatically discover the number of latent processes.
- Learn process durations.
- Apply to open cognitive science problems.
22 References
John R. Anderson, Daniel Bothell, Michael D. Byrne, Scott Douglass, Christian Lebiere, and Yulin Qin. An integrated theory of the mind. Psychological Review, 111(4):1036-1060, 2004. http://act-r.psy.cmu.edu/about/.
Anders M. Dale. Optimal experimental design for event-related fMRI. Human Brain Mapping, 8:109-114, 1999.
Zoubin Ghahramani and Michael I. Jordan. Factorial hidden Markov models. Machine Learning, 29:245-275, 1997.
John-Dylan Haynes and Geraint Rees. Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7:523-534, July 2006.
Gholam-Ali Hossein-Zadeh, Babak A. Ardekani, and Hamid Soltanian-Zadeh. A signal subspace approach for modeling the hemodynamic response function in fMRI. Magnetic Resonance Imaging, 21:835-843, 2003.
Marcel Adam Just, Patricia A. Carpenter, and Sashank Varma. Computational modeling of high-level cognition and brain function. Human Brain Mapping, 8:128-136, 1999. http://www.ccbi.cmu.edu/project 10modeling4CAPS.htm.
Kevin P. Murphy. Dynamic Bayesian networks. To appear in Probabilistic Graphical Models, M. Jordan, November 2002.