The Basal Ganglia (Lecture 6)



1
The Basal Ganglia (Lecture 6)
  • Harry R. Erwin, PhD
  • COMM2E
  • University of Sunderland

2
Why is this important?
  • Not well-understood
  • Hot research area
  • Apparently underlies reward learning.
  • Related to the production of behaviour.
  • May play a role in spatial localization.
  • Now known to be insufficient for goal-directed
    behaviour (Daw and Dayan), which seems to involve
    forward models in the prefrontal cortex or some
    specialised processing in the basal ganglia. Care
    for a Nobel Prize? Solve this!

3
Resources
  • Shepherd, G., ed., 2004, The Synaptic
    Organization of the Brain, 5th edition, Oxford
    University Press.
  • http://scat-he-g4.sunderland.ac.uk/harryerw/phpwiki/index.php/BasalGanglia
  • http://www.unifr.ch/biochem/DREYER/BG.html

4
Reinforcement Learning
  • Montague, PR, Hyman, SE, and Cohen, JD, 2004,
    "Computational roles for dopamine in behavioural
    control," Nature, 431:760-767, 14 October 2004.
  • Reinforcement learning theories discuss how
    (habitual) behaviour is organized in response to
    rewards or reinforcers. This is not
    stimulus-response learning. This is also not how
    goal-oriented behaviour is learned.

5
How it Works
  • The 'reinforcement signal' distribution measures
    the current value of the possible states of the
    agent.
  • The current state of the agent is converted into
    a 'value' using a 'value function'.
  • A 'policy function' then maps the agent's states
    to its possible actions, with the probability of
    each possible action weighted by the value of the
    next state produced by the action.

6
Temporal Difference Learning
  • A form of reinforcement learning of interest here
    is temporal difference learning, where
  • current TD error = current reward + gamma × next
    prediction - current prediction.
  • Supports the learning of a plan leading to a
    reward.
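
As a minimal sketch of the relation above, the snippet below computes a single TD error. The function name `td_error`, the discount `gamma = 0.9`, and the numbers are illustrative assumptions, not values from the lecture.

```python
def td_error(reward, next_prediction, current_prediction, gamma=0.9):
    """Return current reward + gamma * next prediction - current prediction."""
    return reward + gamma * next_prediction - current_prediction

# Hypothetical step: no reward yet, but the next state is predicted to be
# worth more than the current one, so the error comes out positive.
error = td_error(reward=0.0, next_prediction=1.0, current_prediction=0.5)
print(round(error, 3))
```

A positive error says "things are going better than predicted", which is what lets value spread backwards along a sequence of steps that ends in reward.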

7
Actor-Critic Model
  • Sutton and Barto, 1998, Reinforcement Learning,
    MIT Press, describe a mechanism for bootstrapping
    reinforcement learning.
  • Actor-critic methods have a separate memory
    structure to explicitly represent the policy
    independent of the value function. The policy
    structure is known as the actor, because it is
    used to select actions, and the estimated value
    function is known as the critic, because it
    criticizes the actions made by the actor.
  • The critique takes the form of a TD error
    estimate.

8
The Algorithm
  • $\delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t)$,
  • where $r_{t+1}$ is the actual reward at time $t+1$,
  • $s_t$ is the state at time $t$,
  • $V(s)$ is the current perceived value of the state,
    $s$, and
  • $\gamma$ is the discount rate that translates a value at
    time $t+1$ to a lower value at time $t$.
  • $\delta_t$ is the TD error estimate at time $t+1$ of
    following a specific action, $a$, at time $t$.
  • $V(s)$ is zero at terminal states, and $r_t$ is zero
    unless there is a real reward at time $t$.
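
To see the conventions above at work, here is a small sketch on a hypothetical four-state chain in which only the final transition is rewarded. The chain, the constants `GAMMA` and `ALPHA`, and the simple tabular update (nudging V by a fraction of the TD error) are assumptions made for illustration.

```python
GAMMA = 0.9   # discount rate (assumed value)
ALPHA = 0.1   # learning rate for the value table (assumed value)

N_STATES = 4                  # states 0..3; state 3 is terminal
TERMINAL = N_STATES - 1
V = [0.0] * N_STATES          # V stays zero at the terminal state

def reward_on_entering(state):
    """Reward is zero unless a real reward arrives, here only at the end."""
    return 1.0 if state == TERMINAL else 0.0

for episode in range(500):
    s = 0
    while s != TERMINAL:
        s_next = s + 1
        # TD error: actual reward plus discounted next value, minus current value.
        delta = reward_on_entering(s_next) + GAMMA * V[s_next] - V[s]
        V[s] += ALPHA * delta
        s = s_next

print([round(v, 3) for v in V])
```

In this toy chain the learned values settle near 1, 0.9 and 0.81 as one moves away from the reward, which is the discounting described above: each extra step to the reward multiplies the value by the discount rate.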

9
Interpretation
  • $\delta_t$ is the quantity that appears to correspond to
    the dopamine level output by the basal ganglia to
    the cortex (Schultz, et al.).
  • How are $V(s)$ and the preferences for the various
    actions, $a$, updated?
  • Given a set of actions, $a$, let $p(s_t, a_t)$ be the
    preference for action $a$ at time $t$ given state $s$.
  • Then let the probability of picking $a$ be
    $\pi(s,a) = \exp(p(s,a)) / \sum_b \exp(p(s,a_b))$,
    summed over all reasonable actions $a_b$.
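
The action-selection rule above is a softmax over the preferences. A minimal sketch follows; the function name `softmax_policy` and the three example actions are hypothetical.

```python
import math

def softmax_policy(preferences):
    """Turn the preferences for one state's actions into probabilities."""
    top = max(preferences.values())   # subtract the max for numerical stability
    weights = {a: math.exp(p - top) for a, p in preferences.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Hypothetical preferences for three actions in a single state.
pi = softmax_policy({"reach": 1.0, "wait": 0.0, "withdraw": -1.0})
print({a: round(prob, 3) for a, prob in pi.items()})
```

Every action keeps a non-zero probability, so a low-preference action is still tried occasionally; only the relative preferences matter, not their absolute size.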

10
Learning Processes
  • Now, update the value $V(s_t)$ by adding $\delta_t$
    times some learning rate (less than one).
  • Update $p(s_t, a_t)$ by adding $\delta_t$ times another
    learning rate (less than one).
  • That's all, folks.
  • Note the state space is very large.
  • Actor-critic learning cannot cope with changing
    goal values.
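
The two updates can be put together in a few lines. The sketch below is a toy, not the lecture's implementation: it assumes a hypothetical one-step task with a rewarded action ("good") and an unrewarded one ("bad"), and the learning rates, discount, and seed are arbitrary choices.

```python
import math
import random

ALPHA = 0.1   # critic learning rate (assumed)
BETA = 0.1    # actor learning rate (assumed)
GAMMA = 0.9   # discount rate (assumed)

REWARD = {"good": 1.0, "bad": 0.0}        # hypothetical one-step task
V = {"start": 0.0, "end": 0.0}            # critic; "end" is terminal, so V stays 0
p = {("start", a): 0.0 for a in REWARD}   # actor preferences for each action

def policy(state):
    """Softmax over the preferences for the actions available in `state`."""
    weights = {a: math.exp(p[(state, a)]) for a in REWARD}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

rng = random.Random(0)
for trial in range(2000):
    s, s_next = "start", "end"
    probs = policy(s)
    a = rng.choices(list(probs), weights=list(probs.values()))[0]
    delta = REWARD[a] + GAMMA * V[s_next] - V[s]   # TD error from the critic
    V[s] += ALPHA * delta                          # critic update
    p[(s, a)] += BETA * delta                      # actor update

print(policy("start"))
```

The same error signal trains both structures: the critic's value drifts towards the reward the current policy actually earns, and the actor's preference for the rewarded action grows relative to the unrewarded one.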

11
A Few Points
  • Actor-critic learning works better for high-level
    rather than low-level actions. Somehow the
    biological system is able to shift up.
  • Note that the error, $\delta_t$, can be either positive
    or negative. The basal ganglia output both
    dopamine (+) and GABA (-) to represent the error.
    Cocaine has the property of producing an error
    signal that is always positive, which really
    fouls up the learning process.
  • Mirror neurons may play a role in this and autism
    may be a malady of this system.
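
One way to picture why an always-positive error is harmful is the toy comparison below. It is a sketch under strong assumptions: a single rewarded step, a hypothetical `FLOOR` below which the error cannot fall while the drug acts, and an arbitrary learning rate. None of these numbers come from the lecture.

```python
ALPHA = 0.1   # learning rate (assumed)
FLOOR = 0.5   # hypothetical minimum error while the drug acts

def learn(trials, reward, error_floor=None):
    """Repeat one rewarded step and return the learned value of the state."""
    value = 0.0
    for _ in range(trials):
        delta = reward - value                 # TD error; the next state is terminal
        if error_floor is not None:
            delta = max(delta, error_floor)    # the error can never fall below the floor
        value += ALPHA * delta
    return value

natural = learn(200, reward=1.0)
drugged = learn(200, reward=1.0, error_floor=FLOOR)
print(round(natural, 3), round(drugged, 3))
```

With an ordinary reward the error shrinks to zero as the value approaches the reward, so learning stops. With the floor in place the error never vanishes, so the learned value keeps climbing for as long as the trials continue.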

12
The Bootstrap Issue
  • To make this work, the critic has to either
    innately know the rewards for various actor
    actions, or it has to learn them.
  • The resulting 'bootstrap problem' is of
    particular importance in biological systems that
    might implement the model.
  • One approach might be for the critic to reward
    all actions indiscriminately and then as noxious
    stimuli are reported by sensory systems, reduce
    the corresponding rewards.
  • Is this biologically realistic?

13
The Basal Ganglia
  • A richly connected set of brain nuclei in the
    fore- and mid-brain of amniotes.
  • Degenerative diseases tend to produce severe
    movement deficits, but there is reason to believe
    the function of the basal ganglia is more
    general: the selection among candidate movements,
    goals, strategies, and interpretations of sensory
    information.
  • (Wilson, 2004, in Shepherd, from which much of
    this presentation is derived).

14
Rostral Anatomy
<http://thalamus.wustl.edu/course/cerebell.html>
15
Medial Anatomy
<http://thalamus.wustl.edu/course/cerebell.html>
16
Caudal Anatomy
<http://thalamus.wustl.edu/course/cerebell.html>
17
Nuclei of the Basal Ganglia
  • The most prominent are the following:
  • Caudate Nucleus
  • Putamen or Striatum
  • Nucleus Accumbens
  • Globus Pallidus (GP): external segment (GPe),
    internal segment (GPi)
  • Substantia Nigra (SN): pars reticulata (SNr),
    pars compacta (SNc)
  • Subthalamic Nucleus
  • The two largest sources of input are the cerebral
    cortex and thalamus

18
BG Circuits
(from Dreyer, http://www.unifr.ch/biochem/DREYER/BG.html)
19
Neostriatum
  • The neostriatum consists of the caudate nucleus,
    the putamen, and the nucleus accumbens.
  • For the caudate nucleus and putamen, inputs from
    sensory, motor, and association cortical areas
    converge with inputs from the thalamic
    intralaminar nuclei, dopaminergic inputs from the
    SNc, and serotonergic (5-HT) inputs from the dorsal
    raphe nucleus.
  • This subsystem supports planning and
    reinforcement learning involving the PFC.

20
Putamen
  • A portion of the basal ganglia that forms the
    outermost part of the lenticular nucleus.
  • The motor and somatosensory cortices, the
    intralaminar nuclei of the thalamus, and the
    substantia nigra project to the putamen.
  • The putamen projects to premotor and
    supplementary motor areas of cortex via the
    globus pallidus and thalamus.
  • Coextensive with the insula, which has been found
    to contain mirror neurons.

21
Nucleus Accumbens
  • There are similar connections from the limbic
    cortex (emotional) and hippocampus, converging
    with inputs from the ventral tegmental area (VTA)
    in the nucleus accumbens.
  • The VTA is dopaminergic and seems to play a role
    in reward learning.
  • This subsystem appears to support emotional
    learning.

22
Input Structure
  • The cortex, thalamus, and amygdala provide
    glutamatergic input to the neostriatum (and can
    produce LTP or LTD).
  • Most neostriatal interneurons are GABAergic,
    except the cholinergic cells, which are
    neuromodulatory, and the output of the principal
    cells is also GABAergic.

23
Neostriatal Structure
  • Consists mainly of principal neurons and afferent
    fibres, with smaller populations of interneurons.
  • The neostriatum appears to be a functional
    remapping of the cortex, based on common
    interests of some sort. For example the neurons
    concerned with a finger will tend to project to a
    common area.
  • Coincidence detection is important.

24
Neostriatal Neurons
  • GABAergic principal neurons firing rarely and for
    short periods of time (100-3000 msec).
  • The axons emit local collaterals to form an
    extremely rich arborization and then project to
    their long-range destinations.
  • Approximately half are direct pathway neurons and
    the other half are indirect pathway neurons. It's
    unclear in Wilson, but it may be that only the
    direct pathway neurons are collateralized.

25
Neostriatal Interneurons
  • A number of rare types (eight to nine estimated).
    Three major types as follows:
  • Giant cholinergic interneurons forming a dense
    plexus of extremely fine axonal branches.
    Tonically active.
  • GABA/parvalbumin-containing basket cells. Very
    similar to basket cells of the hippocampus and
    cerebral cortex. Linked by gap junctions.
  • Somatostatin (SOM)/nitric oxide synthase
    (NOS)-containing interneurons. A neuromodulatory
    function. Probably GABAergic.

26
Neostriatal Outputs
  • The output of the neostriatum projects to the
    GPe, GPi, and SNr.
  • The GPi and SNr project (GABAergic projections)
    outside the basal ganglia to the thalamus (and
    mostly from there to the frontal cortex), the
    lateral habenular nucleus, and the deep layers of
    the superior colliculus.
  • The GPe projects mostly to the subthalamic
    nucleus, which also receives frontal input and
    finally projects to the GPe, GPi, and SN.

27
Intermediate Processing
  • At the GP and SN, most afference is from the
    neostriatum, with secondary input from the
    subthalamic nucleus.
  • The GPe projects to the GPi and SNr and has
    recurrent local inhibitory connections.
  • The GPe also receives some input from the
    cerebral cortex and thalamus.
  • The subthalamic neurons receive excitatory inputs
    from the cortex and inhibitory input from the GPe.

28
GP Processing
  • The principal cells of the GP are inhibitory,
    receive excitatory input from the subthalamic
    nucleus, and inhibitory input from the
    neostriatum.
  • The GPe inhibits the GPi and the SNr, which are
    the output nuclei of the basal ganglia.

29
Phasic/Tonic
  • The principal cells of the SNc are dopaminergic
    and neuromodulatory. The SNc and the VTA seem to
    encode rewards.
  • The cells of the GP and SN fire tonically, at
    very high rates, the GP and SNr inhibiting
    neurons in the thalamus and SC.
  • Phasic firing of neostriatal neurons produces a
    pause in this tonic firing, allowing thalamic and
    SC neurons to respond to input. (This can also
    terminate tonic activity in the cortex.)

30
Detailed Neostriatal Projections
  • There are two pathways:
  • 'Direct pathway' neurons with direct projections
    to GPi and SN (possibly in addition to the GPe),
    directly playing a role in the output of the
    basal ganglia.
  • 'Indirect pathway' neurons that project only to
    GPe. These affect the output of the basal ganglia
    via projections of the subthalamic nucleus and
    the GPe.

31
Cell Counts
  • The neostriatum is estimated to contain about
    100,000,000 neurons.
  • The GP is about 700,000 neurons in toto, 170,000
    in the GPi. Highly convergent.
  • Spiking in the GP and SN is very localized.
  • Principal cells of the neostriatum receive about
    11,000 afferent synapses from about the same
    number of thalamic and cortical neurons.
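
The degree of convergence follows directly from the counts above. The arithmetic below is only a rough guide, since it assumes every striatal neuron is counted against the GP alone and ignores the SNr; the variable names are illustrative.

```python
neostriatum = 100_000_000   # neurons, from the estimate above
gp_total = 700_000          # globus pallidus, both segments
gpi = 170_000               # internal segment only

ratio_gp = round(neostriatum / gp_total)   # striatal neurons per GP neuron
ratio_gpi = round(neostriatum / gpi)       # striatal neurons per GPi neuron
print(ratio_gp, ratio_gpi)   # prints "143 588"
```

On these figures there are well over a hundred striatal neurons for each pallidal neuron, and several hundred for each GPi neuron.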

32
Patch Structure
  • The primate neostriatum is organized into cell
    islands or clusters (striosomes or patches) in a
    background of lesser cellular density (the
    matrix). Afferent fibres observe this
    compartmentalization, with some cortical regions
    projecting to each.
  • Infragranular pyramidal neurons (layers 5 and 6)
    seem to project to the patches, while
    supragranular neurons (layers 2 and 3) project to
    the matrix.

33
Targets of Patches
  • The patches project preferentially to the
    dopaminergic neurons of the SNc, while the matrix
    projects to the SNr (non-dopaminergic neurons
    projecting to the thalamus and SC).
  • Results in two parallel pathways (in addition to
    the direct and indirect pathways, which are
    present in both).
  • Interneurons in the neostriatum may provide
    intercommunication between the two paths.

34
General Role of the Basal Ganglia
  • The basal ganglia are suspected of being a system
    that detects candidate movements, goals,
    strategies, or interpretations of sensory
    patterns and releases responses.
  • They seem to be a multisensory integration
    system, and this seems particularly the case with
    reference to the SC.

35
How it May Work
  • DA neurons fire in response to the resolution of
    uncertainty about the prospects for reward,
    providing a training signal for the neostriatal
    system.
  • These fire more at the moment when the animal
    recognizes it can begin a behavioural sequence
    that will end with a reward.
  • They pause when an expected reward isn't received.
  • The neostriatum thus detects patterns of cortical
    activity associated with future reward,
    associating values to situations.
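
These firing patterns line up with the sign of a TD error. The sketch below works through four hypothetical moments, assuming a fully learned cue that predicts a reward worth 1.0 and no discounting (`gamma = 1.0`); the variable names are illustrative.

```python
def td_error(reward, next_value, current_value, gamma=1.0):
    """TD error: reward plus discounted next value, minus current value."""
    return reward + gamma * next_value - current_value

# A reward nobody predicted: positive error (burst).
unexpected_reward = td_error(reward=1.0, next_value=0.0, current_value=0.0)
# The cue appears and reward becomes predictable: positive error (burst).
cue_onset = td_error(reward=0.0, next_value=1.0, current_value=0.0)
# The predicted reward arrives on schedule: zero error (no change).
predicted_reward = td_error(reward=1.0, next_value=0.0, current_value=1.0)
# The predicted reward fails to arrive: negative error (pause).
omitted_reward = td_error(reward=0.0, next_value=0.0, current_value=1.0)

print(unexpected_reward, cue_onset, predicted_reward, omitted_reward)
```

The response moves from the reward itself to the earliest reliable predictor of it, and an omitted reward shows up as a dip below baseline.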

36
Why Two Neostriatal Areas?
  • The matrix seems to learn what has worked in the
    past.
  • The patches learn which cortical inputs are best
    able to predict the value of particular
    situations.
  • Patches might use dopaminergic signals based on
    current knowledge to learn how to predict
    dopaminergic signals more accurately. (Houk,
    Adams, and Barto)
  • To avoid a bootstrap problem, there has to be
    innate neural connectivity so that immediate
    rewards for behaviour are signalled to the SN via
    the patches.

37
Basic Mechanism of the BG
  • Disinhibition of proposed actions
  • The basal ganglia output nuclei tonically inhibit
    the thalamic nuclei and the superior colliculus.
  • This inhibition is released when input patterns excite
    principal neurons of the neostriatum.
  • Tonic activity is regulated by striatal projections
    to the GPe (inhibitory principal neurons), which in
    turn acts on the subthalamic nucleus (excitatory
    principal neurons); the latter increases the
    activity of the GPi and SNr neurons, producing a
    balanced opposition of activity.
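
Disinhibition can be pictured with a two-stage rate model. This is a deliberately crude sketch: the function `thalamic_response`, the rectified-linear units, and all weights and rates are assumptions, and the indirect pathway is left out.

```python
def relu(x):
    """Firing rates cannot go below zero."""
    return max(0.0, x)

def thalamic_response(cortical_drive, striatal_burst,
                      tonic_rate=1.0, w_striatum=1.0, w_output=1.0):
    """Toy rate model: striatum inhibits GPi/SNr, which inhibits thalamus."""
    output_nucleus = relu(tonic_rate - w_striatum * striatal_burst)
    return relu(cortical_drive - w_output * output_nucleus)

# Same input to the thalamus, with and without a striatal burst.
blocked = thalamic_response(cortical_drive=0.8, striatal_burst=0.0)
released = thalamic_response(cortical_drive=0.8, striatal_burst=1.0)
print(blocked, released)   # prints "0.0 0.8"
```

The striatum never excites the thalamus directly in this picture. It only removes the brake, so the thalamus responds to whatever input it is already receiving.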

38
Feedback in the Neostriatum
  • Plenz, Dietmar, (2003), "When inhibition goes
    incognito: feedback interaction between spiny
    projection neurons in striatal function," TINS,
    26(8):436-443, August 2003.
  • This paper discusses how spiny projection neurons
    (the principal GABAergic neurons of the striatum)
    process cortical inputs in a highly parallel way.

39
Implications
  • Striatal dynamics are probably not 'winner take
    all'. Local depolarization facilitates the
    depolarization of nearby cells, so that
    behavioural sequences can be generated.
  • Plenz suggests the striatum could also function
    as a resistive grid that computes state
    transitions for movement trajectories. (See
    Connolly and Burns, 1993, "A model for the
    functioning of the striatum," Biological
    Cybernetics 68:535-544.)
  • This is an important but unclear idea.
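
Since the idea is flagged as unclear, the following is only one possible reading of a resistive grid, not a reconstruction of the cited model. It assumes a small hypothetical maze in which the goal node is clamped to a high voltage, walls are clamped to zero, every free node settles to the mean of its four neighbours, and a trajectory is read out by always stepping to the highest-voltage neighbour.

```python
GRID = [
    "S..#.",
    ".#.#.",
    ".#...",
    ".###.",
    "....G",
]
ROWS, COLS = len(GRID), len(GRID[0])
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def find(symbol):
    """Return the (row, column) of the first cell holding `symbol`."""
    for r, row in enumerate(GRID):
        if symbol in row:
            return r, row.index(symbol)
    raise ValueError(symbol)

start, goal = find("S"), find("G")
potential = [[0.0] * COLS for _ in range(ROWS)]
potential[goal[0]][goal[1]] = 1.0   # goal node clamped to the high voltage

def voltage(r, c):
    """Potential at (r, c); walls and everything off the grid sit at 0."""
    if 0 <= r < ROWS and 0 <= c < COLS and GRID[r][c] != "#":
        return potential[r][c]
    return 0.0

# Relax the grid: each free node settles to the mean of its four neighbours.
for sweep in range(500):
    for r in range(ROWS):
        for c in range(COLS):
            if GRID[r][c] != "#" and (r, c) != goal:
                potential[r][c] = sum(voltage(r + dr, c + dc)
                                      for dr, dc in NEIGHBOURS) / 4.0

# Read out a trajectory by always stepping to the highest-voltage neighbour.
path = [start]
while path[-1] != goal and len(path) < ROWS * COLS:
    r, c = path[-1]
    path.append(max(((r + dr, c + dc) for dr, dc in NEIGHBOURS),
                    key=lambda rc: voltage(*rc)))

print(path)
```

The appeal of this scheme is that the whole grid computes in parallel and the settled voltages have no dead-end peaks, so following the gradient from any connected starting point leads to the goal.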

40
Conclusions
  • If you want to use reward learning in a system
    that generates behaviour, look at the
    Actor-Critic model.
  • If you want to build a biologically-inspired
    reward learning system, consider the basal
    ganglia as a model.
  • If you want to do the same for a trajectory
    prediction system, also consider modelling the
    basal ganglia.