Title: Michael Arbib: CS664
1. Michael Arbib: CS664 Neural Models for Visually Guided Behaviour
University of Southern California, Fall 2001
- Lecture 6.
- The Mirror Neuron System Model (MNS) 1
2. Visual Control of Grasping in Macaque Monkey
A key theme of visuomotor coordination: parietal affordances (AIP) drive frontal motor schemas (F5).
AIP - grasp affordances in parietal cortex (Hideo Sakata)
F5 - grasp commands in premotor cortex (Giacomo Rizzolatti)
3. F5 Motor Neurons
- F5 Motor Neurons include all F5 neurons whose firing is related to motor activity.
- We focus on grasp-related behavior. Other F5 motor neurons are related to oro-facial movements.
- F5 Mirror Neurons form the subset of grasp-related F5 motor neurons which discharge when the monkey observes meaningful hand movements.
- F5 Canonical Neurons form the subset of grasp-related F5 motor neurons which fire when the monkey sees an object with related affordances.
4. Mirror Neurons
Rizzolatti, Fadiga, Gallese, and Fogassi, 1995: Premotor cortex and the recognition of motor actions.
Mirror neurons form the subset of grasp-related premotor neurons of F5 which discharge when the monkey observes meaningful hand movements made by the experimenter or another monkey. F5 is endowed with an observation/execution matching system.
5. What is the mirror system (for grasping) for?
Mirror neurons: The cells that selectively discharge when the monkey executes particular actions as well as when the monkey observes another individual executing the same action.
Mirror neuron system (MNS): The mirror neurons and the brain regions involved in eliciting mirror behavior.
Interpretations
- Action recognition
- Understanding (assigning meaning to others' actions)
- Associative memory for actions
6. Computing the Mirror System Response
- The FARS Model
  - Recognize object affordances and determine the appropriate grasp.
- The Mirror Neuron System (MNS) Model
  - We must add recognition of trajectory and hand preshape to recognition of object affordances, and ensure that all three are congruent.
  - There are parietal systems other than AIP adapted to this task.
7. Further Brain Regions Involved
[Figure: brain regions, labeled by function]
- Axis and surface orientation
- Spatial coding for objects, analysis of motion during interaction of objects and self-motion
- Detection of biologically meaningful stimuli (e.g. hand actions); motion-related activity (MT/MST part)
- Mainly somatosensory; mirror-like responses
8. cIPS cell response
Surface orientation selectivity of a cIPS cell (Sakata et al. 1997)
9. Key Criteria for Mirror Neuron Activation When Observing a Grasp
- a) Does the preshape of the hand correspond to the grasp encoded by the mirror neuron?
- b) Does this preshape match an affordance of the target object?
- c) Do samples of the hand state indicate a trajectory that will bring the hand to grasp the object?
Modeling Challenges
- i) To have mirror neurons self-organize to learn to recognize grasps in the monkey's motor repertoire
- ii) To learn to activate mirror neurons from smaller and smaller samples of a trajectory.
10. Infant grasping
[Figure: developmental timeline of infant grasping, in the light and in the dark. Time axis: birth (0), 12 weeks, 16 weeks, 20 weeks, 26 weeks, 9 months. Events marked: innate grasp reflex (later disappears), onset of reaching, onset of grasping, stable grasping, adult-like grasping, mirror neuron system formation.]
In the period from 26 weeks to 9 months, vision is not fully utilized for grasp programming: there is no anticipatory preshaping before contact with the object. So it is hard to accept that the mirror neuron system can compute the compatibility of another's hand motion to the object in this period.
11. Mirror Neuron Development Hypothesis
The development of the (grasp) mirror neuron system in a healthy infant is driven by the visual stimuli generated by the actions (grasps) performed by the infant himself. The infant (with maturation of visual acuity) gains the ability to map other individuals' actions into his internal motor representation. In the MNS model, the hand state provides the key representation for this transfer. Then the infant acquires the ability to create (internal) representations for novel actions observed. Parallel to these achievements, the infant develops an action prediction capability (the recognition of an action given the prefix of the action and the target object).
12. The Mirror Neuron System (MNS) Model
13. Implementing the Basic Schemas of the Mirror Neuron System (MNS) Model using Artificial Neural Networks
- (Work of Erhan Oztop)
- Hand State and Core Mirror Circuit
- Visual Processing
- Reach and Grasp generation
14. MNS: Core Mirror Circuit and Hand State
15. Opposition Spaces and Virtual Fingers
The goal of a successful preshape, reach and grasp is to match the opposition axis defined by the virtual fingers of the hand with the opposition axis defined by an affordance of the object (Iberall and Arbib 1990).
16. What does it take for (monkey) MNS to work?
- Visual input: recognition of hand (h) and object (o) and their temporal and spatial relation (r).
- The set of all valid (h, o, r) combinations which make up a grasp is very large (actually infinite). It is impossible for the system to memorize all such combinations and raise the mirror response flag when a match occurs.
- The visual information must be mapped to a lower-dimensional space which is simpler to handle.
- This space has to capture the grasp type discrimination and the temporal and spatial relation characteristics of the hand and object.
- The aperture(s) between fingers and the position of the thumb with respect to the palm are key to defining the hand configuration relevant for grasp recognition.
- The disparity between the aperture axes and the object grasp axis, and the distance of the hand to the object, are key to defining the relation of the hand to the object.
17. Hand State
- Our current representation of hand state defines a 7-dimensional trajectory F(t) with the following components:
- F(t) = (d(t), v(t), a(t), o1(t), o2(t), o3(t), o4(t))
- d(t): distance to target at time t
- v(t): tangential velocity of the wrist
- a(t): aperture of the virtual fingers involved in grasping at time t
- o1(t): angle between the object axis and the (index finger tip - thumb tip) vector; relevant for pad and palm oppositions
- o2(t): angle between the object axis and the (index finger knuckle - thumb tip) vector; relevant for side oppositions
- o3(t), o4(t): the two angles defining how close the thumb is to the hand, as measured relative to the side of the hand and to the inner surface of the palm.
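The seven components above can be computed from a handful of 3D landmarks. The following is a minimal sketch, not the model's actual implementation. Every name and input is an assumption for illustration: `thumb_base`, `hand_side_dir` and `palm_normal` are hypothetical inputs, the aperture is taken between index finger tip and thumb tip (a pad opposition), and the wrist speed is a finite difference over one time step.

```python
import math

def sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(a - b for a, b in zip(p, q))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def angle(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    c = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def hand_state(wrist, prev_wrist, dt, index_tip, index_knuckle, thumb_tip,
               thumb_base, hand_side_dir, palm_normal, target, object_axis):
    """Return (d, v, a, o1, o2, o3, o4) for one time step (illustrative only)."""
    d = norm(sub(target, wrist))                            # distance to target
    v = norm(sub(wrist, prev_wrist)) / dt                   # wrist speed (finite difference)
    a = norm(sub(index_tip, thumb_tip))                     # aperture (assumed pad opposition)
    o1 = angle(object_axis, sub(index_tip, thumb_tip))      # pad/palm opposition angle
    o2 = angle(object_axis, sub(index_knuckle, thumb_tip))  # side opposition angle
    thumb = sub(thumb_tip, thumb_base)
    o3 = angle(thumb, hand_side_dir)                        # thumb vs. side of hand (assumed)
    o4 = angle(thumb, palm_normal)                          # thumb vs. palm surface (assumed)
    return (d, v, a, o1, o2, o3, o4)
```

Calling `hand_state` once per video frame would give the sampled trajectory F(t).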
18. Hand State components
- For most components we need to know the (3D) configuration of the hand.
19. Assuming that we can compute the hand state trajectory, how can we recognize it as a grasp action?
The general problem: associate N-dimensional space curves with object affordances.
A special case: the recognition of two (or three) dimensional trajectories in physical space.
Simplest solution: map temporal information into the spatial domain, then apply known pattern recognition techniques.
Problem with the simplest solution: the speed of the moving point can be a problem! The spatial representation may change drastically with the speed.
Scaling can overcome the problem. However, the scaling must be such that it preserves the generalization ability of the pattern recognition engine.
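One possible reading of the scaling step is time normalization: resample each trajectory component at a fixed number of points evenly spaced between its start and end, so that the same movement made at different speeds yields the same input vector. The sketch below assumes this reading and uses linear interpolation for brevity (the curve recognition example in the slides uses spline interpolation). The function name `resample` is hypothetical.

```python
def resample(times, values, n):
    """Linearly interpolate one trajectory component at n points evenly
    spaced in normalized time. Requires n >= 2, at least two samples,
    and increasing times."""
    t0, t1 = times[0], times[-1]
    out = []
    j = 0  # index of the segment [times[j], times[j + 1]] containing t
    for i in range(n):
        t = t0 + (t1 - t0) * i / (n - 1)
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        span = times[j + 1] - times[j]
        w = (t - times[j]) / span if span > 0 else 0.0
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out

# The same curve traced slowly and twice as fast gives the same resampled vector.
slow = resample([0, 1, 2, 3, 4], [0, 1, 4, 9, 16], 5)
fast = resample([0, 0.5, 1, 1.5, 2], [0, 1, 4, 9, 16], 5)
```

Applying `resample` to each of the hand state components and concatenating the results gives a fixed-length input regardless of how long the movement took.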
20. A simple example of curve recognition
Curve recognition system demonstrated for hand-drawn numeral recognition (successful recognition examples for 2, 8 and 3).
Spatial resolution: 30. Network input size: 30. Hidden layer size: 15. Output size: 5. Training: back-propagation with momentum and adaptive learning rate.
[Figure legend: sampled points; point used for spline interpolation; fitted spline]
21. Core Mirror Circuit as Neural Network
- With the assumptions:
  - Visual information about the hand and the object can be extracted
  - The information about the hand and the object is represented with the Hand State
- We can apply the curve recognition idea to core mirror circuit learning. Thus:
  - We associate a 2-layer feed-forward neural network with the core mirror circuit
  - Then the learning task is: given the 7-dimensional hand state trajectory, predict the grasp action observed.
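To make the shape of this learning task concrete, here is a minimal sketch of the forward pass of such a network. The sizes are assumptions: 30 samples per trajectory and 15 hidden units are borrowed from the numeral example, and the three outputs stand for power, precision and side grasps. The weights are random and untrained, so the sketch shows only the data flow, not a working classifier.

```python
import math
import random

GRASPS = ("power", "precision", "side")  # assumed output classes

def make_layer(n_in, n_out, rng):
    """Random weights for one layer; the last entry of each row is the bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layer, x):
    """Sigmoid units: weighted sum of the inputs plus bias."""
    return [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + row[-1])))
            for row in layer]

def classify(trajectory, hidden, output, labels=GRASPS):
    """Flatten a list of 7-component hand states into one input vector and
    return (label of the most active output unit, all output activations)."""
    x = [component for sample in trajectory for component in sample]
    y = forward(output, forward(hidden, x))
    return labels[max(range(len(y)), key=y.__getitem__)], y
```

Training (for example back-propagation with momentum, as in the numeral example) would adjust the weights so that the most active output matches the observed grasp.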
22. MNS: Visual Processing
23. Visual Processing for the MNS model
- How much should we attempt to solve?
- Even though computers are getting more powerful, the vision problem in its general form is an unsolved problem in engineering.
- There exist gesture recognition systems for human-computer interaction and sign language interpretation.
- Our vision system must at least recognize
  - 1) The Hand and its Configuration
  - 2) Object features
- We attempt (1).
24. Simplifying the problem
- We simplify the problem of recognizing the Hand and its Configuration by using colored patches on the articulation points of the hand.
- If we can extract the patch positions reliably, then we can try to extract some of the features that make up the hand state by estimating the 3D pose of the hand from its 2D pose.
- Thus we have 2 steps:
  - Extract the color marker positions
  - Estimate 3D pose
25. The Color-Coded Hand
- The vision task is simplified using colored tapes on the joints and articulation points.
- The first step of hand configuration analysis is to locate the color patches unambiguously (not easy!).
Use color segmentation. But we have to compensate for lighting, reflection, shading and wrinkling problems: robust color detection.
26. Robust Detection of the Colors: RGB space
- A color image in a computer is composed of a matrix of pixels, each a triplet (Red, Green, Blue) that defines the color of the pixel.
- We want to label a given pixel color as belonging to one of the color patches we used to mark the hand, or as not belonging to any class.
- A straightforward way to detect whether a given target color (R', G', B') matches the pixel color (R, G, B) is to look at the squared distance (R-R')^2 + (G-G')^2 + (B-B')^2 with a threshold to do the classification.
- This does not work well, because shading and different lighting conditions affect the R, G, B values a lot and our simple nearest-neighbor method fails. For example, an orange patch under shadow is very close to red in RGB space.
- But we can do better: train a neural network that can do the labeling for us.
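The thresholded nearest-color rule, and the way it breaks under shadow, can be sketched in a few lines. The reference colors, the darkened pixel and the threshold below are made-up illustrative values, not measurements from the actual system.

```python
def sq_dist(c1, c2):
    """Squared Euclidean distance between two (R, G, B) triplets."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def label_pixel(pixel, targets, threshold):
    """Return the name of the nearest target color, or None if even the
    nearest one is farther away than the threshold."""
    name, ref = min(targets.items(), key=lambda kv: sq_dist(pixel, kv[1]))
    return name if sq_dist(pixel, ref) <= threshold else None

# Hypothetical marker colors and threshold.
TARGETS = {"orange": (255, 128, 0), "red": (255, 0, 0)}
THRESHOLD = 20000

well_lit = label_pixel((250, 130, 5), TARGETS, THRESHOLD)  # close to the orange reference
shadowed = label_pixel((140, 60, 0), TARGETS, THRESHOLD)   # darkened orange, lands nearer red
unmarked = label_pixel((0, 0, 255), TARGETS, THRESHOLD)    # far from every marker color
```

With these values the shadowed orange pixel has squared distance 17849 to orange but 16825 to red, so the rule mislabels it as red.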
27. Robust Detection of Colors: the Color Expert
Create a training set using a test image by manually picking colors from the image and specifying their labels.
Create a NN (in our case a one-hidden-layer feed-forward network) that will accept the R, G, B values as input and put out the marker label, or 0 for a non-marker color. Make sure that the network is not too powerful, so that it does not memorize the training set (as distinct from generalization).
Train it, then use it: when given a pixel to classify, apply the RGB values of the pixel to the trained network and use the output as the marker that the pixel belongs to.
One then needs a segmentation system to aggregate the pixels into a patch with a single color label.
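The aggregation step can be illustrated with a plain connected-component pass over the per-pixel labels. This is a generic sketch, not the segmentation system used in the model: it assumes the color expert has already produced a 2D grid of integer labels (0 for non-marker), uses 4-connectivity, and reports each patch's label, size and centroid. The function name `find_patches` is hypothetical.

```python
from collections import deque

def find_patches(labels):
    """Group 4-connected pixels with the same non-zero label.
    Returns a list of (label, pixel_count, (row_centroid, col_centroid))."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] == 0 or seen[r][c]:
                continue
            lab = labels[r][c]
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one patch
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == lab):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            found.append((lab, len(pixels), (cy, cx)))
    return found

grid = [[1, 1, 0, 0],
        [1, 1, 0, 2],
        [0, 0, 0, 2]]
patches = find_patches(grid)
```

The patch centroids are the kind of 2D marker positions that a later pose estimation step could consume.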
28. Color Expert Summary
[Figure labels: Preprocessing; Color Expert (network weights)]
Training phase: A color expert is generated by training a feed-forward network to approximate human perception of color.
29. Color Segmentation and Feature Extraction
[Figure labels: NN-augmented segmentation system; Features]
Actual processing: The hand image is fed to an augmented segmentation system. The color decision during segmentation is made by consulting the color expert.
30. Hand Configuration Extraction
[Figure: Color-Coded Hand -> Feature Extraction -> Model Matching -> Hand Configuration]
Step 1: The first step of the hand shape recognition system processes the color-coded hand image and generates a set of features to be used by the second step.
Step 2: The feature vector generated by the first step is used to fit a 3D kinematics model of the hand by the model matching module. The resulting hand configuration is sent to the classification module.
31. MNS: Reach and Grasp generation
32. Virtual Hand/Arm and Reach/Grasp Simulator
[Figure: simulator renderings of a precision pinch, a power grasp and a side grasp]
33. Kinematics model of arm and hand
- 19 degrees of freedom (DOF): Shoulder (3), Elbow (1), Wrist (3), Fingers (4x2), Thumb (3)
- Implementation Requirements
  - Rendering: given the 3D positions of the links' start and end points, generate a 2D representation of the arm/hand (easy)
  - Forward Kinematics: given the 19 joint angles, compute the position of each link (easy)
  - Reach/Grasp execution: harder than simple inverse kinematics, since there are more constraints to be satisfied (e.g. multiple target positions to be achieved at the same time)
  - Inverse Kinematics: given a desired position in space for a particular link, what are the joint angles that achieve the desired position? (semi-hard)
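34. A 2D, 3DOF arm example
[Figure: planar arm with link lengths a, b, c and joint angles A, B, C, reaching the end effector position P(x,y)]
Forward kinematics: given joint angles A, B, C, compute the end effector position P:
X = a cos(A) + b cos(B) + c cos(C)
Y = a sin(A) + b sin(B) + c sin(C)
Inverse kinematics: given the position P, there are infinitely many joint angle triplets that achieve it.
[Figure: two circles of radius a and radius c, joined by several segments of length b, with P(x,y) marked]
The radii of the circles are a and c, and the segments connecting the circles all have the same length b.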
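The forward kinematics of the planar arm is a direct transcription of the two sums. In the sketch below each angle is taken as the absolute orientation of its link with respect to the x axis, which is what the formula implies since every term uses a single angle. The function name `forward_kinematics` is hypothetical.

```python
import math

def forward_kinematics(lengths, angles):
    """End effector position of a planar 3-link arm.
    lengths = (a, b, c); angles = (A, B, C) in radians, each measured
    from the x axis."""
    a, b, c = lengths
    A, B, C = angles
    x = a * math.cos(A) + b * math.cos(B) + c * math.cos(C)
    y = a * math.sin(A) + b * math.sin(B) + c * math.sin(C)
    return x, y

# Fully stretched along the x axis: the end effector sits at distance a + b + c.
stretched = forward_kinematics((1.0, 1.0, 1.0), (0.0, 0.0, 0.0))
# Links pointing right, up, then left: the x contributions cancel.
folded = forward_kinematics((1.0, 1.0, 1.0), (0.0, math.pi / 2, math.pi))
```

Many different angle triplets reach the same point, which is why the inverse problem needs an extra criterion or an iterative search.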
35. A Simple Inverse Kinematics Solution
- Consider just the arm.
- The forward kinematics of the arm can be represented as a vector function that maps joint angles of the arm to the wrist position.
- (x,y,z) = F(s1,s2,s3,e), where s1, s2, s3 are the shoulder angles and e is the elbow angle.
- We can formulate the inverse kinematics problem as an optimization problem: given the desired P = (x,y,z) to be achieved, we can introduce the error function
- J = || P - F(s1,s2,s3,e) ||
- Then we can compute the gradient with respect to s1, s2, s3, e and follow the minus gradient to reach the minimum of J.
- This method is called the Jacobian Transpose method, as the partial derivatives of F encountered in the above process can be arranged into the transpose of a special derivative matrix called the Jacobian (of F).
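The idea can be tried out on a smaller problem. The sketch below swaps the 4-angle 3D arm for a planar 3-link arm with absolute link angles, and assumes the error function is half the squared distance between target and end effector, so that its negative gradient is exactly the Jacobian transpose applied to the position error. The step size, tolerance and function names are illustrative choices.

```python
import math

def forward(lengths, angles):
    """End effector position of a planar arm with absolute link angles."""
    x = sum(l * math.cos(t) for l, t in zip(lengths, angles))
    y = sum(l * math.sin(t) for l, t in zip(lengths, angles))
    return x, y

def jacobian_transpose_ik(lengths, angles, target, rate=0.1, tol=1e-6, max_steps=5000):
    """Gradient descent on 0.5 * |target - forward(angles)|^2."""
    angles = list(angles)
    for _ in range(max_steps):
        x, y = forward(lengths, angles)
        ex, ey = target[0] - x, target[1] - y  # position error
        if math.hypot(ex, ey) < tol:
            break
        # Column i of the Jacobian is (-l_i sin t_i, l_i cos t_i), so the
        # i-th entry of (Jacobian transpose) * error is:
        step = [-l * math.sin(t) * ex + l * math.cos(t) * ey
                for l, t in zip(lengths, angles)]
        angles = [t + rate * s for t, s in zip(angles, step)]
    return angles

LENGTHS = (1.0, 1.0, 1.0)
TARGET = (1.5, 1.0)
solution = jacobian_transpose_ik(LENGTHS, (0.1, 0.5, 1.0), TARGET)
reached = forward(LENGTHS, solution)
```

Starting from a different initial posture generally ends in a different solution, reflecting the redundancy of the arm.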
36. Power grasp time series data
[Figure legend: aperture, angle 1, angle 2, 1-axisdisp1, 1-axisdisp2, speed, distance]
37. A single grasp trajectory viewed from three different angles
How the network classifies the action as a power grasp. Empty squares: power grasp output; filled squares: precision grasp output; crosses: side grasp output.
The wrist trajectory during the grasp is shown by square traces, with the distance between any two consecutive trace marks traveled in equal time intervals.
38. Power and precision grasp resolution
Note that the modeling yields novel predictions for the time course of activity across a population of mirror neurons.
39. Spatial Perturbation Experiment with the trained core mirror circuit
Figure A. A regular precision grasp (the hand spatially coincides with the target).
Figure B. The response of the network, classifying the action as a precision grasp.
Figure C. The target object is displaced to create a fake grasp.
Figure D. The response of the network to the action in Figure C. The activity of the precision mirror neuron is reduced.
In the graphs the x axes represent the normalized time (0 for start of grasp, 1 for the contact with the object) and the y axes represent the cell firing rate.
40. Kinematics Alteration Experiment with the trained core mirror circuit
Figure A. A regular precision grasp (the wrist has a bell-shaped velocity profile).
Figure B. The velocity profile is (almost) linear.
Figure C. Classification of the action in Figure A as precision grasp.
Figure D. The activity vanished during the observation of the action. Note that the scales of graphs C and D are different.
[Axes: normalized speed vs. normalized time for the velocity profiles; firing rate vs. normalized time for the network responses]
41. Research Plan
- Development of the Mirror System
  - Development of Grasp Specificity in F5 Motor and Canonical Neurons
  - Visual Feedback for Grasping: A Possible Precursor of the Mirror Property
- Recognition of Novel and Compound Actions and their Context
  - The Pliers Experiment: Extending the Visual Vocabulary
  - Recognition of Compounds of Known Movements
  - From Action Recognition to Understanding: Context and Expectation
42. Modeling Challenges
How can MNS be plugged into a learning-by-imitation system that is faithful to biological constraints (BG, Cerebellum, SMA, PFC, etc.)?
How does the brain handle temporal data? Transform the learning network into one which can work directly on temporal data.
Eliminate the preprocessing required before the input can be applied to the MNS core circuit.
Extend the actions to be recognized beyond simple grasps.
Model the complementary circuit: learning to grasp by trial and error.
And a lot more!
43. Experimental Challenges
What are poor mirror neurons coding?
- temporal recognition codes
- transient response to actions which are not exactly the preferred stimuli
How can we relate different cells' responses to each other?
- Fix the condition and record from as many cells as possible under exactly the same condition.
Is it possible to record from mirror cells in different age groups of monkeys (i.e. infant to adult)?