Title: AG 1: Irrationale Entscheidungs- und Gedächtnisprozesse im Gehirn
1Learning from rewards and punishment
- AG 1: Irrationale Entscheidungs- und Gedächtnisprozesse im Gehirn
- Amory Faber
- Sommerakademie St. Johann, 30 August to 12 September 2009
2Overview
- Background knowledge
- Rewards and punishers
- Reward neurons
- Learning from errors
- Differential dopamine responses
- Prediction error signal
- Rewarding stimuli
- Aversive stimuli
- Experimental evidence
- Single-cell recording in pictures
- Lateral habenula as a source of negative reward signals in dopamine neurons
- Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli
3Rewards and punishers
- Reward objects for animals
- Food
- Liquids (e.g. orange juice)
- Sex
(food, liquids and sex: vegetative rewards)
- Touch (stroking)
- Presentation of novel objects
- Motivational effects of rewards
- Generate approach and consummatory behaviour
- Constitute positive outcomes of the preceding stimuli and actions
- Serve as positive reinforcers (come back for more)
- Produce reward predictions used in decision making
4Rewards and punishers
- Punishers used in the lab
- Electric shocks
- Painful pinches
- Air puffs
- Hypertonic saline
- Motivational effects of punishers
- Produce withdrawal behaviour
- Constitute negative outcomes
- Serve as negative reinforcers
- Produce aversive predictions for decision making
- Lead to avoidance (if possible)
- → motivationally opposite effects to rewards!
5In search of reward neurons
- Reward essential for survival (learning!)
- No specific sensory receptors
- Explicit neural signals for reward?
- → neurons in various brain areas respond to rewards (orbitofrontal, premotor and prefrontal cortex, striatum, amygdala, midbrain)
- → are they all reward neurons?
6In search of reward neurons
- Stimulus properties
- spatial position
- visual object features (colour and shape)
- motivational features (reward prediction)
- Only midbrain dopamine neurons signal the pure reward value of objects (irrespective of the sensory components)
- → extract the reward component from stimulus
- → dopamine neurons = reward neurons
(Schultz, 2007)
- Dopamine (DA) neurons are activated preferentially by rewards, but only rarely by punishers
- Burst activity (phasic response)
(Schultz, 2006)
7What does the response of reward neurons look like?
- Full learning episode in a DA neuron
- Initial response to reward
- Acquired response to the conditioned stimulus (CS; Pavlovian conditioning)
- Response to reward itself disappears
- → neural changes occur in parallel with behavioural changes!
modified from Schultz (2006)
8Extinction of learned response
- Omission of reward → extinction
modified from Schultz (2006)
9Errors and learning
- Errors contribute to the self-organization of behaviour
- Predictions are established
- Current input is compared with predictions from previous experience
- If mismatch → prediction-error signal!
- This signal might trigger synaptic modifications → predictions and behaviour are changed
- Reiterations, until behavioural outcome matches the predictions (no error)
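The cycle on this slide (predict, compare, update, repeat until there is no error) can be sketched in a few lines of Python. This is only an illustration: it assumes a Rescorla-Wagner-style update rule, which the slides do not name, and the function name `learn_until_no_error`, the learning rate `alpha` and the tolerance `tol` are hypothetical choices.

```python
def learn_until_no_error(reward, alpha=0.3, tol=0.01, max_trials=1000):
    """Repeat trials until the prediction matches the outcome.

    Assumed update rule (Rescorla-Wagner style, not taken from the slides):
        prediction <- prediction + alpha * (reward - prediction)
    """
    prediction = 0.0   # nothing is predicted before learning
    errors = []        # prediction error observed on each trial
    for _ in range(max_trials):
        error = reward - prediction   # mismatch between outcome and prediction
        errors.append(error)
        if abs(error) < tol:          # outcome matches the prediction: stop
            break
        prediction += alpha * error   # the error drives the change
    return prediction, errors


prediction, errors = learn_until_no_error(reward=1.0)
print(f"trials needed: {len(errors)}, final prediction: {prediction:.3f}")
```

The error shrinks on every trial, so the update it triggers also shrinks, and learning stops by itself once the outcome is fully predicted.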
10Errors and learning
- Errors are necessary for learning
- Not only in behavioural learning, but also at single-neuron level?
- Reward-prediction error
- positive (unpredicted reward)
- negative (omission of predicted reward)
- Reward-prediction error = difference between obtained and predicted rewards
- Can this be coded by DA neurons?
11Prediction error signal
- Dopamine neurons emit a prediction error signal!
- Activation after an unpredicted reward (positive prediction error)
- No response to fully predicted reward (no prediction error)
- Depression after omission of a predicted reward (negative prediction error)
from Schultz (2006)
12Prediction error signal
- DA neurons show bidirectional coding of reward-prediction errors
- dopamine response = reward occurred − reward predicted
- Responses are graded (i.e. if only partial prediction error → smaller error signal)
- They also code temporal information (time shift of reward elicits new signal)
from Schultz (2006)
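The relation between the dopamine response, the reward that occurred and the reward that was predicted can be written as a one-line function. The sketch below is illustrative only: reward and prediction are put on an arbitrary 0 to 1 scale, and the four example cases are made-up values chosen to match the situations named on the slides.

```python
def prediction_error(reward_occurred, reward_predicted):
    """Reward-prediction error: reward occurred minus reward predicted."""
    return reward_occurred - reward_predicted


# (reward occurred, reward predicted) on an arbitrary 0..1 scale (assumed values)
cases = {
    "unpredicted reward": (1.0, 0.0),
    "fully predicted reward": (1.0, 1.0),
    "omitted predicted reward": (0.0, 1.0),
    "partially predicted reward": (1.0, 0.5),
}

for name, (occurred, predicted) in cases.items():
    print(f"{name}: {prediction_error(occurred, predicted):+.1f}")
```

The sign of the result corresponds to activation or depression, and the partially predicted case gives a smaller positive value, which is the graded response.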
13Prediction error signal during learning
One learning episode
from Schultz (2006)
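How the error signal could move from the reward to the CS over one learning episode can be illustrated with temporal-difference (TD) learning, a model that the slides do not mention and that is used here purely as an assumption. The trial length, the CS and reward times, the learning rate and the absence of discounting are all hypothetical choices.

```python
T = 20       # time steps per trial (hypothetical)
T_CS = 5     # time step of CS onset (hypothetical)
T_REW = 15   # time step of reward delivery (hypothetical)


def run_trial(values, rewarded=True, alpha=0.2, learn=True):
    """Return the TD error at every time step of one trial.

    values[t] is the predicted future reward once time step t is reached.
    Only steps from CS onset until just before the reward can carry a
    prediction; before the CS nothing is predicted, so those values stay 0.
    """
    deltas = [0.0] * T
    for t in range(1, T):
        reward = 1.0 if (rewarded and t == T_REW) else 0.0
        delta = reward + values[t] - values[t - 1]   # TD error, no discounting
        deltas[t] = delta
        if learn and T_CS <= t - 1 < T_REW:
            values[t - 1] += alpha * delta           # update the earlier prediction
    return deltas


values = [0.0] * T
first = run_trial(values)                       # naive: reward is unpredicted
for _ in range(500):                            # repeated CS-reward pairings
    run_trial(values)
late = run_trial(values, learn=False)           # after learning
omitted = run_trial(values, rewarded=False, learn=False)

print(f"first trial:    CS {first[T_CS]:+.2f}, reward time {first[T_REW]:+.2f}")
print(f"after learning: CS {late[T_CS]:+.2f}, reward time {late[T_REW]:+.2f}")
print(f"reward omitted: CS {omitted[T_CS]:+.2f}, reward time {omitted[T_REW]:+.2f}")
```

In this sketch the error starts at the reward, ends up at the CS, and turns negative at the expected reward time when the reward is withheld.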
14Rewarding stimuli
- Typical response to rewards
- Phasic activation (burst firing) with short latency
- Phasic activation in dopamine neurons is
- elicited by conditioned visual, auditory and somatosensory reward-predicting stimuli
- irrespective of spatial position and sensory stimulus attributes
- modulated by motivation of the animal
15Aversive stimuli
- Dopaminergic neurons respond with depressions
- Comparison with responses to rewarding stimuli
- Depression instead of excitation
- Longer latencies (5-10 times slower)
- Last for several seconds
- → dopamine neurons distinguish clearly between aversive and rewarding stimuli!
17Single-cell recording
- Subjects: leeches
- Leech preparation, extract one ganglion
- Find a neuron, lower electrode into it
- Record and amplify the signal of this single cell (e.g. Retzius cell)
24Anatomical orientation: Lateral habenula
- Lateral habenula (LHb) neurons project to substantia nigra
25Anatomical orientation: Substantia nigra
- DA neurons are located in the substantia nigra
(pars compacta)
26Introduction
- DA neurons: key components of reward system
- Respond to rewards (or to stimuli that predict rewards)
- How are they provided with this reward-related information?
- Many brain areas project to the substantia nigra (SN) → one of them is the lateral habenula
- Lateral habenula has been implicated in
- Anxiety
- Stress
- Pain
- Avoidance learning
- Attention
- Human reward processing
27Method: recording
- Two rhesus monkeys
- Plastic head holders and recording chambers surgically fixed to skull
- Single-cell recording and electrical stimulation in lateral habenula and DA neurons (using MRI to estimate location, and histological staining of brain sections after sacrificing the monkey)
- Stable action potentials from 74 habenula neurons → task-related responses in 49 of them → n = 49
- Deeper parts of the lateral habenula were not explored fully → other types of neurons with different properties could also exist
- Only DA neurons that responded to reward-predicting stimuli with phasic excitation were selected → n = 62
28Method: experimental design
- Visual saccade task
- Task: make a saccade to the target as quickly as possible
- Correct saccades were signalled by a tone after 200 ms and simultaneously rewarded
- Saccades to one direction were rewarded, the others not (reversed after 24 trials)
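The block-wise reversal of the reward rule can be made concrete with a small trial generator. This is a sketch under stated assumptions: only the block length of 24 trials comes from the slide, while the two directions "left" and "right", the random choice of target on each trial and the function name `make_schedule` are hypothetical.

```python
import random

BLOCK_LENGTH = 24  # from the slide: the contingency reverses after 24 trials


def make_schedule(n_blocks, directions=("left", "right"), seed=0):
    """Build a trial list in which the rewarded direction switches every block."""
    rng = random.Random(seed)
    trials = []
    for block in range(n_blocks):
        # The rewarded direction alternates from block to block (assumption).
        rewarded_direction = directions[block % len(directions)]
        for _ in range(BLOCK_LENGTH):
            target = rng.choice(directions)   # target side picked at random
            trials.append({
                "block": block,
                "target": target,
                "rewarded": target == rewarded_direction,
            })
    return trials


schedule = make_schedule(n_blocks=4)
n_rewarded = sum(trial["rewarded"] for trial in schedule)
print(f"{len(schedule)} trials, {n_rewarded} of them rewarded")
```

Because the same target direction is rewarded in one block and unrewarded in the next, reward contingency and target position can be told apart in the analysis.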
29Results - behavioural
- Reaction times were significantly shorter in rewarded trials!
30Results: neuronal responses
- At first glance
- Habenula neurons get
- excited by non-reward-predicting targets (blue line)
- inhibited by reward-predicting ones (red line)
31Results
- One line from left to right represents one trial. The first 24 trials were rewarded, the next 24 not, etc.
- Outcome: tone plus reward (only in reward trials)
- Pink line: saccade onsets (varying from trial to trial)
- Cyan line: outcome onsets (depending on saccade onset)
- Spikes (dots) were aligned to
- target onset (left) and
- outcome onset (right)
- → in reward trials, saccade onset is earlier (shorter reaction times)
- → in reward trials, there is less activity than in no-reward trials (red circle)
[Figure: spike rasters, panels labelled reward and no reward]
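Aligning the spikes of one trial to two different events, as in the raster described above, amounts to subtracting the event time from every spike time. The sketch below uses made-up spike and event times in milliseconds; the function name `align_spikes` and the analysis window are hypothetical, and the 200 ms outcome delay is assumed to be measured from saccade onset.

```python
OUTCOME_DELAY_MS = 200   # tone 200 ms after a correct saccade (assumed reference point)


def align_spikes(spike_times_ms, event_ms, window_ms=(-500, 1000)):
    """Return spike times relative to an event, restricted to a window."""
    start, stop = window_ms
    return [s - event_ms for s in spike_times_ms if start <= s - event_ms <= stop]


# One made-up trial; all times are in ms from the start of the recording.
trial = {
    "spikes_ms": [120, 480, 950, 1010, 1400, 1900],
    "target_onset_ms": 1000,
    "saccade_onset_ms": 1180,
}

outcome_onset_ms = trial["saccade_onset_ms"] + OUTCOME_DELAY_MS
by_target = align_spikes(trial["spikes_ms"], trial["target_onset_ms"])
by_outcome = align_spikes(trial["spikes_ms"], outcome_onset_ms)

print("aligned to target onset: ", by_target)
print("aligned to outcome onset:", by_outcome)
```

Since saccade onset varies from trial to trial, the same spike gets a different position in the two alignments, which is why the raster shows both.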
32Results
- 43 neurons showed a significant main effect of reward contingency (reward vs. no reward)
- only 10 of target position
- → post-target response is mainly influenced by reward contingency!
- → no-reward trials: activity increases
- → reward trials: activity decreases
[Figure: panels labelled reward and no reward]
33 34Differential results
- Habenula neurons
- excited by non-reward-predicting targets
- inhibited by reward-predicting ones
- Dopamine neurons
- reversed: inhibited by non-reward-predicting targets, excited by reward-predicting ones
35Tentative conclusions
- Habenula and dopamine neuronal activity are causally related
- In no-reward trials, excitation of habenula neurons precedes the inhibition of DA neurons → inhibitory influence?
- Habenula neurons affect dopamine reward responses (by sending inhibitory input)
- In reward trials, inhibition of habenula neurons does not precede the excitation of DA neurons → no excitatory influence
36Theoretical framework?
- Lateral habenula is involved in negative reward processing while DA neurons are involved in positive reward processing
- Opponent-process theory (Solomon, 1974)
- Interactions between an appetitive and an aversive system
- Emotions are paired: if one emotion is experienced (e.g. happiness), the other (e.g. sadness) is suppressed.
- Rebound reaction towards the suppressed emotion after the end of stimulation.
- A response: X wins a match, gets an award and experiences great feelings of joy.
- B response: After a few hours, X feels a bit let down and sad.
- Parachute jumpers
37Theoretical framework?
- Sudden introduction of pleasurable / aversive stimulus
- → affective reaction
- Termination of stimulus
- → affective reaction disappears
- → affective after-reaction (opposite quality)
Do you agree?
Standard pattern of affective dynamics
from Solomon (1974)
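The standard pattern (reaction at stimulus onset, then an after-reaction of opposite quality once the stimulus ends) can be reproduced with a very simple simulation. This is a rough sketch, not the model from the cited paper: it assumes a fast primary process that follows the stimulus and a sluggish opponent process modelled as a leaky integrator, and all parameter values (`tau`, `b_gain`, onset and offset times) are hypothetical.

```python
def affective_dynamics(n_steps=300, onset=50, offset=200, tau=30.0, b_gain=0.6):
    """Net affect over time for a stimulus switched on at onset and off at offset."""
    b = 0.0    # slow opponent process, starts at rest
    net = []
    for t in range(n_steps):
        a = 1.0 if onset <= t < offset else 0.0   # fast primary process
        b += (b_gain * a - b) / tau               # opponent builds up and decays slowly
        net.append(a - b)                         # experienced affect = primary - opponent
    return net


net = affective_dynamics()
print(f"peak at onset:       {net[50]:+.2f}")
print(f"level before offset: {net[199]:+.2f}")
print(f"just after offset:   {net[200]:+.2f}")
print(f"end of simulation:   {net[-1]:+.2f}")
```

In this sketch the after-reaction appears only because the opponent process outlasts the stimulus; with a very small `tau` it would vanish almost immediately.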
39- Introduction
- Apart from substantia nigra, there are also many dopaminergic neurons in the VTA (ventral tegmental area, part of the midbrain)
- Most studies focus on the dorsorostral VTA → neglect ventromedial DA neurons
- Do all DA neurons encode the same information? What about aversive stimuli? Do all DA neurons get inhibited by them?
- Method
- Recording from neurons in dorsal and ventral VTA
- Anesthetized rats
- Intense noxious stimuli (electric shock to hind paw)
40Results
- Dorsal part of VTA → inhibition or no response to noxious footshock (as expected)
- Ventral part of VTA → strong excitation!
41Discussion
- Information coding in DA neurons: two competing theories
- DA neurons are only activated by rewards (Schultz, 1998)
- DA neurons are activated by all salient stimuli (Redgrave et al., 1999)
- Reconciliation
- 2 functionally and anatomically distinct VTA dopamine systems!
- Dorsal VTA: activated by rewards, inhibited by noxious stimuli
- Ventral VTA: activated by noxious stimuli, ??? by rewards
42References
- Matsumoto, M. & Hikosaka, O. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature 447, 1111-1115 (2007).
- Matsumoto, M. & Hikosaka, O. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459, 837-841 (2009).
- Brischoux, F., Chakraborty, S., Brierley, D.I. & Ungless, M.A. Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli. Proc Natl Acad Sci U S A 106, 4894-4899 (2009).
- Schultz, W. Behavioral dopamine signals. Trends Neurosci (2007).
- Schultz, W. Behavioral theories and the neurophysiology of reward. Annu Rev Psychol 57, 87-115 (2006).
- Solomon, R.L. & Corbit, J.D. An opponent-process theory of motivation. I. Temporal dynamics of affect. Psychol Rev 81, 119-145 (1974).
43Thank you very much for your attention!