Title: Learning, Volatility and the ACC
1. Learning, Volatility and the ACC
- Tim Behrens
- FMRIB Psychology, University of Oxford
- FIL - UCL.
2. [Figure: panel B]
Kennerley et al., Nature Neuroscience, 2006
3. [Figure: panel B, ACCs]
Kennerley et al., Nature Neuroscience, 2006
4. Monkeys will sacrifice food opportunities to look at other monkeys
Rudebeck et al., Science, 2006
5. Interest in other individuals is reduced after ACC gyrus lesion
Rudebeck et al., Science, 2006
6. Anatomy - Differences in connections between ACCs and ACCg.
- Connections unique to the sulcus are mainly with motor regions
  - Primary motor cortex
  - Premotor cortex
  - Parietal motor areas
  - Spinal cord
- ACCs has information about our own actions
7. Anatomy - Differences in connections between ACCs and ACCg.
- Connections unique to the gyrus are mainly with regions that process emotional and biological stimuli
  - Periaqueductal grey
  - Hypothalamus
  - STS/STG
- Insula/Temporal pole connections are stronger to the gyrus
- ACCg has access to information about other agents.
8. Anatomy - Shared connections between ACCs and ACCg.
- Some shared connections
  - Orbitofrontal cortex
  - Amygdala
  - Ventral striatum
- ACCg and ACCs are strongly interconnected
- Both regions have access to and influence over reward and value processing.
9. ACC sulcus and learning about your actions.
10. [Figure: panel B, ACCs]
Kennerley et al., Nature Neuroscience, 2006
11. What determines the integration length?
Kennerley et al., Nature Neuroscience, 2006
Sugrue et al., Science, 2005
12. VOLATILE: Reward probabilities change approximately every 25 trials
STABLE: Reward probabilities change only after hundreds of trials
Kennerley et al., Nature Neuroscience, 2006
Sugrue et al., Science, 2005
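The two schedule types can be sketched as follows. This is a minimal illustration, not the schedule used in the cited studies: the probability levels (0.2 and 0.8), the exact block lengths, and the names `reward_schedule` and `sample_rewards` are assumptions chosen for the example.

```python
import random


def reward_schedule(n_trials, block_length, p_levels=(0.2, 0.8)):
    """Return the reward probability on each trial.

    The probability alternates between the two (hypothetical) levels
    every `block_length` trials.
    """
    return [p_levels[(t // block_length) % 2] for t in range(n_trials)]


def sample_rewards(probabilities, seed=0):
    """Draw a binary reward (1 or 0) on each trial from its probability."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for p in probabilities]


# Volatile: the probability switches every 25 trials (assumed exact here).
volatile = reward_schedule(n_trials=600, block_length=25)
# Stable: the probability switches only after hundreds of trials.
stable = reward_schedule(n_trials=600, block_length=300)
rewards = sample_rewards(volatile)
```

In this sketch the volatile schedule switches 23 times in 600 trials and the stable schedule switches once.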
13. Reinforcement learning
- We need to continually re-appraise the value of an action based on each new experience.
14. Updating beliefs on the basis of new information
$V_{t+1} = V_t + (\alpha \times \delta)$
15. The learning rate and the value of information.
$V_{t+1} = V_t + (\alpha \times \delta)$
The learning rate $\alpha$ should represent the value of the current information for guiding future beliefs.
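A minimal sketch of the update rule, assuming the prediction error $\delta$ is the outcome minus the current expectation. The function name `update_value`, the starting estimate of 0.5, and the outcome sequence are hypothetical, chosen only to show how the learning rate scales the influence of each new observation.

```python
def update_value(value, outcome, learning_rate):
    """One step of V_{t+1} = V_t + alpha * delta."""
    delta = outcome - value              # prediction error (assumed form)
    return value + learning_rate * delta


outcomes = [1, 1, 0, 1, 1]  # hypothetical rewards (1) and omissions (0)
for alpha in (0.1, 0.5):
    value = 0.5  # initial estimate, chosen arbitrarily
    for outcome in outcomes:
        value = update_value(value, outcome, alpha)
    print(f"alpha={alpha}: final value estimate = {value:.3f}")
```

With the higher learning rate, the estimate moves further on every trial, so the single omission pulls it down more sharply and the subsequent rewards pull it back up more quickly.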
16. Relationship with integration length
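One way to see the link, assuming a fixed learning rate and a prediction error of the form $\delta_t = r_t - V_t$ (outcome minus expectation, where $r_t$ is notation introduced here for the outcome on trial $t$): unrolling the update shows that each past outcome contributes with an exponentially decaying weight.

```latex
\begin{align*}
V_{t+1} &= V_t + \alpha\,(r_t - V_t) = (1-\alpha)\,V_t + \alpha\, r_t \\
        &= \alpha \sum_{k=0}^{t-1} (1-\alpha)^{k}\, r_{t-k} + (1-\alpha)^{t}\, V_1
\end{align*}
```

For small $\alpha$ the weights decay over roughly $1/\alpha$ trials, so under these assumptions a high learning rate corresponds to a short integration length and a low learning rate to a long one.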
17. [Figure showing the values 37 and 63]
Behrens et al., Nature Neuroscience, 2007
18. $V_{t+1} = V_t + \alpha \times \delta$
Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007
19. Changes in reward estimates occur throughout the task, as do changes in volatility estimates
Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007
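Why volatility should matter for the learning rate can be illustrated with a small simulation. This is a sketch under stated assumptions, not the model from the cited study: it runs a fixed-learning-rate update of the form V_{t+1} = V_t + alpha * delta on a hypothetical volatile schedule (reward probability alternating between 0.2 and 0.8 every 25 trials) and a hypothetical stable schedule (probability fixed at 0.75), then reports which learning rate from a small grid tracks the true probability with the lowest mean squared error. All names and parameter values are illustrative.

```python
import random


def tracking_error(probabilities, alpha, seed=0):
    """Mean squared error between a delta-rule estimate and the true probability."""
    rng = random.Random(seed)
    value, total = 0.5, 0.0
    for p in probabilities:
        total += (value - p) ** 2
        outcome = 1 if rng.random() < p else 0
        value += alpha * (outcome - value)   # V_{t+1} = V_t + alpha * delta
    return total / len(probabilities)


def best_learning_rate(probabilities, alphas):
    """Learning rate in `alphas` with the lowest tracking error."""
    return min(alphas, key=lambda a: tracking_error(probabilities, a))


n_trials = 2000
alphas = (0.02, 0.05, 0.1, 0.2, 0.3, 0.5)
volatile = [(0.2, 0.8)[(t // 25) % 2] for t in range(n_trials)]
stable = [0.75] * n_trials

best_volatile = best_learning_rate(volatile, alphas)
best_stable = best_learning_rate(stable, alphas)
print("best alpha, volatile:", best_volatile)
print("best alpha, stable:", best_stable)
```

With these assumed settings the volatile schedule should favour a higher learning rate than the stable one, though the exact values depend on the grid, the seed, and the schedule parameters.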
20. Monitor x Volatility
Decide Monitor
Behrens et al., Nature Neuroscience, 2007
21. ACC effect size predicts learning rate across subjects
Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007
22. ACC gyrus and learning about your social partners.
23. Interest in other individuals is reduced after ACC gyrus lesion
Rudebeck et al., Science, 2006
24. Rudebeck et al., Science, 2006
25. Learning about other agents
[Figure showing the values 37 and 63]
Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
26. Sources of information
Probability that correct colour is blue
Probability that confederate advice is good
Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
27. Social information is integrated over time - behaviour
28. Reward Prediction Error
$V_{t+1} = V_t + (\alpha \times \delta)$
$\delta$ = Reward - Expectation
[Figure: Outcome; axes: Effect size, Time]
Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
29. Prediction error on a social partner.
$V_{t+1} = V_t + (\alpha \times \delta)$
$\delta$ = Lie event - Lie prediction
[Figure: Outcome; axes: Effect size, Time]
Behrens, Hunt, Woolrich, Rushworth, Nature, 2008
30. The value of information and the ACC
$V_{t+1} = V_t + (\alpha \times \delta)$
Value of reward information
Value of social information
31. Combining information to drive behaviour
$V_{t+1} = V_t + (\alpha \times \delta)$
32. Conclusions
- ACC codes a learning signal when information is observed.
- This signal predicts the speed of learning.
- Learning from our own actions and from others' actions is processed in parallel, in ACCs and ACCg respectively.
- The outputs of these parallel learning processes are combined in the reward system.
33. Acknowledgments
- Matthew Rushworth
- Mark Woolrich
- Laurence Hunt
- Mark Walton