Title: CSC2535: Advanced Machine Learning Lecture 11b Adaptation at multiple time-scales
1 CSC2535: Advanced Machine Learning, Lecture 11b: Adaptation at multiple time-scales
2 An overview of how biology solves search problems
- Searching for good combinations can be very slow if it's done in a naive way.
- Evolution has found many ways to speed up searches.
  - Evolution works too well to be blind. It is being guided.
  - It has discovered much better methods than the dumb trial-and-error method that many biologists seem to believe in.
3 Some search problems in Biology
- Searching for good genes and good policies for when to express them.
  - To understand how evolution is so efficient, we need to understand forms of search that work much better than random trial and error.
- Searching for good policies about when to contract muscles.
  - Motor control works much too well for a system with a 30-millisecond feedback loop.
- Searching for the right synapse strengths to represent how the world works.
  - Learning works much too well to be blind trial and error. It must be doing something smarter than just randomly perturbing synapse strengths.
4 A way to make searches work better
- In high-dimensional spaces, it is a very bad idea to try making multiple random changes.
  - It's impossible to learn a billion synapse strengths by randomly changing synapses.
  - Once the system is significantly better than random, almost all combinations of random changes will make it worse.
- It is much more effective to compute a gradient and change things in the direction that makes things better.
  - That's what brains are for. They are devices for computing gradients. Gradients of what?
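The contrast between random changes and a gradient step can be illustrated with a toy cost function. This is only a sketch under stated assumptions: the quadratic cost, the 1000 dimensions, the noise level `sigma` and the learning rate `lr` are all made up for illustration and do not come from the lecture.

```python
import random


def cost(weights):
    """Toy cost (an assumption): squared distance from an optimum at the origin."""
    return sum(w * w for w in weights)


def random_change(weights, rng, sigma):
    """Perturb every weight at once with independent Gaussian noise."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def gradient_step(weights, lr):
    """Move against the gradient of the toy cost, which is 2 * w."""
    return [w - lr * 2.0 * w for w in weights]


def fraction_of_random_changes_that_help(weights, rng, sigma, tries):
    """Estimate how often a simultaneous random change lowers the cost."""
    base = cost(weights)
    wins = sum(cost(random_change(weights, rng, sigma)) < base for _ in range(tries))
    return wins / tries


rng = random.Random(0)
weights = [0.1] * 1000  # a point fairly close to the optimum (illustrative)
helped = fraction_of_random_changes_that_help(weights, rng, sigma=0.05, tries=200)
after_gradient = cost(gradient_step(weights, lr=0.1))
print("fraction of random changes that helped:", helped)
print("cost before:", cost(weights), "cost after one gradient step:", after_gradient)
```

In this toy setting the random perturbation adds noise in every dimension at once, so the cost it adds swamps any accidental improvement, while a single small gradient step lowers the cost.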
5 A different way to make searches work better
- It is much easier to search a fitness landscape that has smooth hills rather than sharp spikes.
- Fast adaptive processes can change the fitness landscape to make search much easier for slow adaptive processes.
6 An example of a fast adaptive process changing the fitness landscape for a slower one
- Consider the task of drawing on a blackboard.
- It is very hard to do with a dumb robot arm.
  - If the robot positions the tip of the chalk just beyond the board, the chalk breaks.
  - If the robot positions the chalk just in front of the board, the chalk doesn't leave any marks.
- We need a very fast feedback loop that uses the force exerted by the board on the chalk to stop the chalk.
  - Neural feedback is much too slow for this.
7 A biological solution
- Set the relative stiffnesses of opposing muscles so that the equilibrium point has the tip of the chalk just beyond the board.
- Set the absolute stiffnesses so that small perturbations from equilibrium only cause small forces (this is called compliance).
- The feedback loop is now in the physical system, so it works at the speed of shockwaves in the arm.
- The feedback in the physics makes a much nicer fitness landscape for learning how to set the muscle stiffnesses.
8 The energy landscape created by two opposing muscles
[Figure: physical energy in the opposing springs plotted against the location of the endpoint, with the start position and the location of the board marked.]
The difference of the two muscle stiffnesses determines where the minimum is. The sum of the stiffnesses determines how sharp the minimum is.
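A minimal sketch of this energy landscape, assuming the two muscles behave like ideal linear springs pulling the endpoint towards anchors at -1 and +1 (the anchors, the function names and the stiffness values are illustrative assumptions, not part of the lecture). In this simplified model the minimum sits at the stiffness difference divided by the stiffness sum, and the curvature at the minimum equals the sum.

```python
def spring_energy(x, k_left, k_right, left_anchor=-1.0, right_anchor=1.0):
    """Total elastic energy of two ideal springs attached to the endpoint at x."""
    return (0.5 * k_left * (x - left_anchor) ** 2
            + 0.5 * k_right * (x - right_anchor) ** 2)


def equilibrium(k_left, k_right, left_anchor=-1.0, right_anchor=1.0):
    """Endpoint position where the energy is lowest (the two forces balance)."""
    return (k_left * left_anchor + k_right * right_anchor) / (k_left + k_right)


def sharpness(k_left, k_right):
    """Second derivative of the energy: how sharp the minimum is."""
    return k_left + k_right


# Same stiffness ratio, so the same equilibrium, but a ten times sharper minimum.
soft = (1.0, 3.0)
stiff = (10.0, 30.0)
print(equilibrium(*soft), sharpness(*soft))
print(equilibrium(*stiff), sharpness(*stiff))
```

Scaling both stiffnesses together leaves the equilibrium where it was and only changes how strongly the arm resists being pushed away from it, which is the compliance setting described on the previous slide.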
9 Two fitness landscapes
- System that directly specifies joint angles
- System that specifies spring stiffnesses
[Figure: two plots of fitness against neural signals, one for each system.]
10 Objective functions versus programs
- By setting the muscle stiffnesses, the brain creates an energy function.
  - Minimizing this energy function is left to the physics.
  - This allows the brain to explore the space of objective functions (i.e. energy landscapes) without worrying about how to minimize the objective function.
- Slow adaptive processes should interact with fast ones by creating objective functions for them to optimize.
  - Think how a general interacts with soldiers. He specifies their goals.
  - This avoids micro-management.
11 Generating the parts of an object
[Figure labels: "square" with pose parameters; sloppy top-down activation of parts; parts with top-down support; clean-up using lateral interactions specified by the layer above.]
It's like soldiers on a parade ground.
12 Another example of the same principle
- The principle: Use fast adaptive processes to make the search easier for slow ones.
- An application: Make evolution go a lot faster by using a learning algorithm to create a much nicer fitness landscape (the Baldwin effect).
  - Almost all of the search is done by the learning algorithm, but the results get hard-wired into the DNA.
  - It's strictly Darwinian even though it achieves most of what Lamarck wanted.
13 A toy example to explain the idea
- Consider an organism that has a mating circuit containing 20 binary switches. If exactly the right subset of the switches is closed, it mates very successfully. Otherwise not.
- Suppose each switch is governed by a separate gene that has two alleles.
- The search landscape for unguided evolution is a one-in-a-million spike.
  - Blind evolution has to build about a million organisms to get one good one.
  - Even if it finds a good one, that combination of genes will almost certainly be destroyed in the next generation by crossover.
14 Guiding evolution with a fast adaptive process (godless intelligent design :-)
- Suppose that each gene has three alleles: ON, OFF, and "leave it to learning".
  - ON and OFF are decisions hard-wired into the DNA.
  - "Leave it to learning" means that on each learning trial, the switch is set randomly.
- Now consider organisms that have 10 switches hard-wired and 10 left to learning.
  - One in a thousand will have the correct hard-wired decisions, and with only about a thousand learning trials, all 20 switches will be correct.
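The round numbers in this toy example can be checked with a few lines of arithmetic. This is a sketch that assumes each hard-wired switch is equally likely to be ON or OFF and that each learning trial sets the 10 learned switches independently and uniformly at random; the variable names are illustrative.

```python
N_SWITCHES = 20    # from the slides
N_HARDWIRED = 10   # from the slides
N_TRIALS = 1000    # from the slides

# Chance that a random genome has all 20 switches right with no learning at all.
p_blind = 0.5 ** N_SWITCHES

# Chance that the 10 hard-wired switches are all right (assumes ON/OFF equally likely).
p_hardwired_ok = 0.5 ** N_HARDWIRED

# Chance that a single learning trial sets the remaining 10 switches right.
p_trial = 0.5 ** (N_SWITCHES - N_HARDWIRED)

# Chance that at least one of the 1000 learning trials succeeds.
p_learned = 1.0 - (1.0 - p_trial) ** N_TRIALS

print("blind search: one in", round(1 / p_blind))
print("correct hard-wiring: one in", round(1 / p_hardwired_ok))
print("learning succeeds within", N_TRIALS, "trials with probability", round(p_learned, 2))
```

Under these assumptions the "one in a million" and "one in a thousand" figures are 2^20 and 2^10, and an organism with the right hard-wired switches finds the full combination within 1000 trials with probability about 0.62, so "about a thousand trials" is the right order of magnitude rather than a guarantee.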
15 The search tree
[Figure: search tree. Evolution: 1000 nodes. Learning: 999,000 nodes.]
Evolution can ask learning: "Am I correct so far?"
99.9% of the work required to find a good combination is done by learning. A learning trial is MUCH cheaper than building a new organism.
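One way to read the node counts, offered as an assumption about how the figures were obtained: about 1000 organisms each run about 1000 learning trials, giving roughly a million nodes in total, of which all but the 1000 organism nodes are explored by learning.

```python
organisms = 1000            # nodes explored by evolution (from the slide)
trials_per_organism = 1000  # learning trials per organism (from the toy example)

total_nodes = organisms * trials_per_organism
learning_nodes = total_nodes - organisms
learning_share = learning_nodes / total_nodes

print(learning_nodes, f"{learning_share:.1%}")
```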
16 The results of a simulation (Hinton and Nowlan 1987)
- After building about 30,000 organisms, each of which runs 1000 learning trials, the population has nearly all of the correct decisions hard-wired into the DNA.
- The pressure towards hard-wiring comes from the fact that with more of the correct decisions hard-wired, an organism learns the remaining correct decisions faster.
- This suggests that learning performed almost all of the search required to create brain structures that are currently hard-wired.
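A simulation in this spirit can be sketched in a few dozen lines. Everything below beyond the 20 switches and 1000 learning trials is an assumption made for illustration and is not taken from the slides: the population size, the initial allele frequencies, the fitness scaling, fitness-proportional selection and single-point crossover. The sketch also samples the trial of first success from a geometric distribution instead of running the trials one by one, and it treats ON as the correct setting for every switch.

```python
import math
import random

N_SWITCHES = 20               # from the slides
N_TRIALS = 1000               # learning trials per organism, from the slides
POP_SIZE = 1000               # assumption
ALLELES = ("1", "0", "?")     # ON (assumed correct), OFF (assumed wrong), leave it to learning
WEIGHTS = (0.25, 0.25, 0.5)   # assumed initial allele frequencies


def fitness(genome, rng):
    """Higher fitness for organisms that find the right setting sooner."""
    if "0" in genome:
        return 1.0  # a wrong hard-wired switch cannot be repaired by learning
    undecided = genome.count("?")
    if undecided == 0:
        return 20.0  # everything is hard-wired correctly, no learning needed
    p = 0.5 ** undecided  # chance that one random trial gets every '?' right
    u = 1.0 - rng.random()  # uniform in (0, 1]
    first_success = int(math.log(u) / math.log(1.0 - p)) + 1  # geometric sample
    if first_success > N_TRIALS:
        return 1.0
    remaining = N_TRIALS - first_success
    return 1.0 + 19.0 * remaining / N_TRIALS  # assumed fitness scaling


def next_generation(pop, rng):
    """Fitness-proportional parent selection followed by single-point crossover."""
    fits = [fitness(g, rng) for g in pop]
    children = []
    for _ in range(len(pop)):
        mum, dad = rng.choices(pop, weights=fits, k=2)
        cut = rng.randrange(1, N_SWITCHES)
        children.append(mum[:cut] + dad[cut:])
    return children


def run(generations, pop_size=POP_SIZE, seed=0):
    """Return the fraction of correct hard-wired alleles after each generation."""
    rng = random.Random(seed)
    pop = ["".join(rng.choices(ALLELES, weights=WEIGHTS, k=N_SWITCHES))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop = next_generation(pop, rng)
        history.append(sum(g.count("1") for g in pop) / (pop_size * N_SWITCHES))
    return history


history = run(generations=5, pop_size=200)
print(history)
```

Calling `run` with more generations and the full population size tracks how the share of correct hard-wired alleles changes over time. The sketch makes no claim to reproduce the published figures.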
17 Using the dynamics of neural activity to speed up learning
- A Boltzmann machine has an inner-loop iterative search to find a locally optimal interpretation of the current visible vector.
  - Then it updates the weights to lower the energy of the locally optimal interpretation.
- An autoencoder can be made to use the same trick.
  - It can do an inner-loop search for a code vector that is better at reconstructing the input than the code vector produced by its feedforward encoder.
  - This speeds the learning if we measure the learning time in number of input vectors presented to the autoencoder (Ranzato, PhD thesis, 2009).
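The inner-loop search for a better code vector can be sketched with a tiny linear decoder. This is an illustration under assumptions, not the method of the cited thesis: the decoder matrix `W`, the squared-error objective, the plain gradient descent on the code and the starting code `z0` (standing in for the output of a feedforward encoder) are all hypothetical.

```python
def decode(W, z):
    """Linear decoder: reconstruct the input from the code vector z."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def recon_error(W, z, x):
    """Squared reconstruction error for input x and code z."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, decode(W, z)))


def refine_code(W, z0, x, steps=50, lr=0.05):
    """Inner loop: gradient descent on the code vector with the decoder held fixed."""
    z = list(z0)
    for _ in range(steps):
        residual = [xi - ri for xi, ri in zip(x, decode(W, z))]
        # d(error)/d(z_j) = -2 * sum_i W[i][j] * residual[i]
        grad = [-2.0 * sum(W[i][j] * residual[i] for i in range(len(x)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z


W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical decoder weights
x = [1.0, 2.0, 3.0]                       # input vector
z0 = [0.5, 0.5]                           # stand-in for the feedforward encoder's code
z = refine_code(W, z0, x)
print(recon_error(W, z0, x), recon_error(W, z, x))
```

In a full training loop the refined code `z`, rather than `z0`, would be used when updating the decoder, and the encoder could be trained to predict `z`. That outer loop is left out of this sketch.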
18 Major Stages of Biological Adaptation
- Evolution keeps inventing faster inner loops to make the search easier for slower outer loops:
  - Pure evolution: each iteration takes a lifetime.
  - Development: each iteration of gene expression takes about 20 minutes. The developmental process may be optimizing objective functions specified by evolution (see next slide).
  - Learning: each iteration takes about a second.
  - Inference: in one second, a neural network can perform many iterations to find a good explanation of the sensory input.
19 The three-eyed frog
- The two retinas of a frog connect to its tectum in a way that tries to satisfy two conflicting goals:
  1. Each point on the tectum should receive inputs from corresponding points on the two retinas.
  2. Nearby points on one retina should go to nearby points on the tectum.
- A good compromise is to have interleaved stripes on the tectum.
  - Within each stripe all cells receive inputs from the same retina.
  - Neighboring stripes come from corresponding places on the two retinas.
20 What happens if you give a frog embryo three eyes?
- The tectum develops interleaved stripes of the form LMRLMRLMR.
- This suggests that in the normal frog, the interleaved stripes are not hard-wired.
  - They are the result of running an optimization process during development (or learning).
- The advantage of this is that it generalizes much better to unforeseen circumstances.
  - It may also be easier for the genes to specify goals than the details of how to achieve them.
21 The next great leap?
- Suppose that we let each biological learning trial consist of specifying a new objective function.
- Then we use computer simulation to evaluate the objective function in about one second.
- This creates a new inner loop that is millions of times faster than a biological learning trial.
- Maybe we are on the brink of a major new stage in the evolution of biological adaptation methods. We are in the process of adding a new inner loop:
  - Evolution, development, learning, simulation
22 THE END