
1
Learning to perceive how hand-written digits
were drawn
  • Geoffrey Hinton
  • Canadian Institute for Advanced Research
  • and
  • University of Toronto

2
Good old-fashioned neural networks
[Diagram: a feed-forward net from an input vector through hidden layers (with biases) to outputs. Compare outputs with the correct answer to get an error signal; back-propagate the error signal to get derivatives for learning.]
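A minimal sketch of this training loop for a one-hidden-layer net (the layer sizes, squared-error loss, and all names are illustrative choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = 0.01 * rng.normal(size=(784, 100)), np.zeros(100)   # hidden layer
W2, b2 = 0.01 * rng.normal(size=(100, 10)), np.zeros(10)     # output layer
lr = 0.1

x = rng.normal(size=784)    # an input vector (e.g. pixel intensities)
target = np.eye(10)[3]      # the correct answer, one-hot

for _ in range(100):
    h = np.tanh(x @ W1 + b1)                   # hidden activities
    y = h @ W2 + b2                            # outputs
    dy = y - target                            # error signal (squared-error gradient)
    dh = (dy @ W2.T) * (1 - h ** 2)            # back-propagated error signal
    W2 -= lr * np.outer(h, dy); b2 -= lr * dy  # derivatives for learning
    W1 -= lr * np.outer(x, dh); b1 -= lr * dh
```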
3
Two different ways to use backpropagation for
handwritten digit recognition
  • Standard method: train a neural network with 10
    output units to map from pixel intensities to
    class labels.
  • This works well, but it does not make use of a
    lot of prior knowledge that we have about how the
    images were generated.
  • The generative approach: first write a graphics
    program that converts a motor program (a sequence
    of muscle commands) into an image.
  • Then learn to map pixel intensities to motor
    programs.
  • Digit classes are far more natural in the space
    of motor programs.

These twos have very similar motor programs.
4
A variation
  • It is very difficult to train a single net to
    recognize very different motor programs. So train
    a separate network for each digit class.
  • When given a test image, get each of the 10
    networks to extract a motor program.
  • Then see which of the 10 motor programs is best
    at reconstructing the test image.
  • Also consider how similar that motor program is
    to the other motor programs for that class.
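A sketch of this classify-by-reconstruction scheme (the net objects, generator, and scoring are illustrative assumptions):

```python
import numpy as np

def classify(image, nets, generate_image, class_codes):
    """Each of the 10 class-specific nets extracts a motor program;
    the digit whose program best reconstructs the image, and looks
    typical for its class, wins."""
    scores = []
    for digit, net in enumerate(nets):
        code = net.predict(image)    # extracted motor program
        recon_err = np.sum((generate_image(code) - image) ** 2)
        # Penalize programs unlike the known programs for this class.
        typicality = min(np.sum((code - c) ** 2) for c in class_codes[digit])
        scores.append(recon_err + typicality)
    return int(np.argmin(scores))
```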

5
A simple generative model
  • We can generate digit images by simulating the
    physics of drawing.
  • A pen is controlled by four springs.
  • The trajectory of the pen is determined by the 4
    spring stiffnesses at 17 time steps.
  • The ink is produced from 60 evenly spaced points
    along the trajectory.
  • Use bilinear interpolation to distribute the ink
    at each point to the 4 closest pixels.
  • Use time-invariant parameters for the ink
    intensity and the width of a convolution kernel.
  • Then clip intensities above 1.

We can also learn to set the mass, viscosity, and
positions of the four endpoints for each image.
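A minimal sketch of the ink-deposition step (the spring dynamics and the convolution are elided; the grid size and all names are illustrative):

```python
import numpy as np

def splat_ink(points, ink, size=28):
    """Distribute the ink at each continuous (x, y) point to the 4
    closest pixels by bilinear interpolation, weighting by proximity."""
    image = np.zeros((size, size))
    for x, y in points:
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        fx, fy = x - x0, y - y0
        for dx, wx in ((0, 1 - fx), (1, fx)):
            for dy, wy in ((0, 1 - fy), (1, fy)):
                if 0 <= x0 + dx < size and 0 <= y0 + dy < size:
                    image[y0 + dy, x0 + dx] += ink * wx * wy
    return np.clip(image, 0.0, 1.0)   # clip intensities above 1
```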
6
Some ways to invert a generator
  • Look inside the generator to see how it works and
    try to invert each step of the generative
    process.
  • It's hard to invert processes that lose
    information:
  • The third dimension.
  • The correspondence between model-parts and
    image-parts.
  • Define a prior distribution over codes and
    generate lots of (code, image) pairs. Then train
    a recognition neural network that does image →
    code.
  • But where do we get the prior over codes?
  • The distribution of codes is exactly what we want
    to learn from the data!
  • Is there any way to do without the prior over
    codes?
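A sketch of the (code, image)-pair recipe with a toy stand-in generator (the linear generator, the prior, and the sizes are made-up placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 10))     # toy stand-in for the graphics program

def generate_image(code):
    return np.tanh(W @ code)

def sample_prior():
    """An assumed prior over codes -- exactly the part we do not have."""
    return rng.normal(size=10)

# Labeled pairs for supervised training of a recognition net
# that inverts the generator: image -> code.
pairs = [(generate_image(c), c) for c in (sample_prior() for _ in range(10000))]
```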

7
A way to train a class-specific model from a
single prototype
  • Start with a single prototype code.
  • Learn to invert the generator in the vicinity of
    the prototype by adding noise to the code and
    generating (code, image) pairs for training a
    neural net.
  • Then use the learned model to get codes for real
    images that are similar to the prototype. Add
    noise to these codes and generate more training
    data.
  • This extends the region that can be inverted
    along the data manifold (with genetic jumps).
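A sketch of this bootstrapping loop (the generator, the fit/predict-style recognition net, and all parameters are illustrative assumptions):

```python
import numpy as np

def bootstrap_train(prototype, generate_image, recog_net, real_images,
                    rounds=10, n_pairs=1000, noise=0.1):
    """Invert the generator near the prototype first, then expand
    outward along the data manifold of the digit class."""
    rng = np.random.default_rng(0)
    codes = [prototype]
    for _ in range(rounds):
        # Perturb known-good codes and render them to get labeled pairs.
        pairs = []
        for _ in range(n_pairs):
            base = codes[rng.integers(len(codes))]
            code = base + noise * rng.normal(size=base.shape)
            pairs.append((generate_image(code), code))
        images, targets = map(np.array, zip(*pairs))
        recog_net.fit(images, targets)   # supervised image -> code
        # Codes inferred for real images seed the next round, extending
        # the invertible region along the manifold.
        codes = [recog_net.predict(img) for img in real_images]
    return recog_net
```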

[Diagram: the manifold of the digit class in code space, with the prototype and a nearby datapoint marked on it.]
8
An example during the later stages of training
About 2 ms
9
How training examples are created
[Diagram: biases encode the prototype. Noise is added to the clean code to give a noisy code, which the generator renders into a training image. The recognition net maps this image through its hidden units to a predicted code; the code error against the clean code is used for training. The predicted code is also rendered into a reconstructed image, and the reconstruction error against the image is used for testing.]
10
How the perceived trajectory changes at the early
stages of learning
11
Typical fits to the training data at the end of
training
12
Typical fits to the training data at the end of
training
13
Typical fits to the training data at the end of
training
14
Typical fits to the training data at the end of
training
15
Performance on test data (blue = 2, red = 3)
16
The five errors
17
Performance on test data if we do not use an
extra trick
18
On test data, the model often gets the
registration slightly wrong
The neural net has solved the difficult global
search problem, but it has got the fine details
wrong, so we need to perform a local search to
fix up the details.
19
Local search
  • Use the trajectory produced by the neural network
    as an initial guess.
  • Then use the difference between the data and the
    reconstruction to adjust the guess.
  • This means we need to convert residual errors in
    pixel space into gradients in trajectory space.
  • But how do we backpropagate the pixel residuals
    through the generative graphics model?
  • Make a neural network version of the generative
    model.

20
How the generative neural net is trained
We use the same training pairs as for the recognition net.
[Diagram: noise is added to the clean code to give a noisy code, which the graphics generator renders into the training image. The generative hidden units map the same noisy code to a reconstructed image, and the reconstruction error against the training image is used for training. The recognition hidden units are shown for comparison.]
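A sketch of this stage as plain regression from codes to images (the sklearn regressor and the made-up shapes are illustrative stand-ins):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# (image, code) pairs as built for the recognition net; shapes made up.
images = np.random.rand(1000, 784)
codes = np.random.rand(1000, 10)

# The generative net learns the swapped mapping, code -> image, so that
# pixel errors can later be pushed back through it to get gradients
# in code space.
gen_net = MLPRegressor(hidden_layer_sizes=(100,), max_iter=200)
gen_net.fit(codes, images)
```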
21
An example during the later stages of training
the generative neural network
22
How the generative neural net is used
[Diagram: the recognition hidden units map the real image to a clean code, which initializes the current code. The generative hidden units render a current image from the current code; pixel error gradients from the reconstruction error against the real image are backpropagated through the generative net to give code gradients, which update the current code.]
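A sketch of the resulting local search as gradient descent on the code (finite differences stand in here for backpropagation through the generative net; every name is illustrative):

```python
import numpy as np

def local_search(real_image, recog_net, gen_net, steps=50, lr=0.01, eps=1e-4):
    """Refine the recognition net's code by descending the squared
    pixel error of the generative net's reconstruction."""
    code = recog_net.predict(real_image)          # initial guess
    for _ in range(steps):
        base = gen_net.predict(code)
        grad = np.zeros_like(code)
        for i in range(code.size):                # crude numerical gradient
            bumped = code.copy()
            bumped[i] += eps
            diff = (gen_net.predict(bumped) - base) / eps
            grad[i] = 2.0 * np.sum((base - real_image) * diff)
        code = code - lr * grad                   # update the code
    return code
```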
23
(Image-only slide)
24
The 2 model gets better at fitting 2s, but it
also gets better at fitting 3s.
25
(Image-only slide)
26
Improvement from local search
  • Local search reduces the squared pixel error of
    the correct model by 20-50%, depending on how
    well the networks are trained.
  • The squared pixel error of the wrong model is
    reduced by a much smaller fraction. It often
    increases, because the pixel residuals are
    converted to motor-program adjustments using a
    generative net that has not been trained in that
    part of the space.
  • The classification error improves by 25-40%.

27
Why spring stiffnesses are a good language
  • If the mass is not where we thought it was during
    planning, the force that is generated is the
    planned force plus a feedback term that pulls the
    mass back towards where it should have been.
  • The four spring stiffnesses define a quadratic
    energy function, which generates an acceleration
    that depends on where the mass is.

[Diagram: an iso-energy contour around the planned position; the planned force on the mass is augmented by a feedback term.]
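A minimal sketch of this force computation (assuming four zero-rest-length springs attached at fixed endpoints; the names and values are illustrative):

```python
import numpy as np

def pen_acceleration(pos, stiffnesses, endpoints, mass=1.0,
                     viscosity=0.0, vel=None):
    """Quadratic energy E(x) = 0.5 * sum_i k_i |x - a_i|^2 gives the
    force F = -dE/dx = sum_i k_i (a_i - x): displacing the pen from
    the planned position automatically adds a corrective feedback
    force on top of the planned force."""
    force = sum(k * (a - pos) for k, a in zip(stiffnesses, endpoints))
    if vel is not None:
        force = force - viscosity * vel   # damping from the viscosity
    return force / mass

# Example: the pen is pulled toward the stiffest spring's endpoint.
endpoints = [np.array(p, float) for p in [(0, 0), (1, 0), (0, 1), (1, 1)]]
print(pen_acceleration(np.array([0.5, 0.5]), [1.0, 2.0, 1.0, 1.0], endpoints))
```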
28
A more ambitious application
  • Suppose we have a way to generate realistic
    images of faces from underlying variables that
    describe emotional expression, lighting, pose,
    and individual 3-D face shape.
  • Given a big database of unlabeled faces, we
    should be able to recover the state of the
    generator.

29
Conclusions so far
  • Computers are becoming fast enough to allow
    adaptive networks to be used in all sorts of
    novel ways.
  • The most principled way to do perception is by
    inverting a generative model.

30
THE END
31
An even more ambitious application
  • Suppose we have a model of a person in which the
    muscles are springs.
  • We are also given a prototype motor program that
    makes the model walk for a few steps.
  • Can we train a neural network to convert the
    current dynamic state plus a partial description
    of a desired walk into a motor program that
    produces the right behaviour for the next 100
    milliseconds?

32
Some other generative models
  • We could use a B-spline with 8 control points to
    generate the curve (Williams et al.). This does
    not learn to fit the images nearly as well.
  • If we use 16 control points it ties itself in
    knots. Momentum is useful.
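For comparison, a minimal sketch of generating a trajectory from a cubic B-spline with 8 control points (the control-point values are made up; scipy is used for illustration):

```python
import numpy as np
from scipy.interpolate import BSpline

# 8 control points in 2-D (illustrative values).
ctrl = np.array([[0, 0], [1, 3], [3, 4], [5, 3],
                 [5, 1], [3, 0], [1, 1], [0, 3]], float)
k = 3  # cubic
# Clamped knot vector: n + k + 1 = 12 knots for n = 8 control points.
t = np.concatenate(([0.0] * k, np.linspace(0, 1, len(ctrl) - k + 1), [1.0] * k))
curve = BSpline(t, ctrl, k)
points = curve(np.linspace(0, 1, 60))   # 60 points along the curve
```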