MIT 6.899: Learning and Inference in Vision
- Prof. Bill Freeman, wtf_at_mit.edu
- MW 2:30-4:00
- Room 34-301
- Course web page: http://www.ai.mit.edu/courses/6.899/
Reading class
- We'll cover about 1 paper each class.
- Seminal or topical research papers in the intersection of machine learning and vision.
- One student will present each paper. Then we'll discuss the paper as a class.
- One student will write a computer example illustrating the paper's main idea.
Learning and Inference
- Learning: learn the parameter values or structure of a probabilistic model. Look at many examples of people walking, and build up a probabilistic model relating video images to 3-d motions.
- Inference: infer hidden variables, given observations. E.g., given a particular video of someone walking, infer their motions in 3-d.
Learning and Inference
(Figure: a graphical model with observed variables, unobserved variables, and the statistical dependencies between them.)
- Learning: learn this model, and the form of the statistical dependencies.
Learning and Inference
(Figure: the same graphical model, with unobserved variables x1, x2 and observed variables y1, y2.)
- Learning: learn this model, and the form of the statistical dependencies.
- Inference: given this model and the observations y1, y2, infer x1, x2, or their conditional distribution (a minimal numerical sketch follows below).
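To make the learning/inference split concrete, here is a minimal numerical sketch (not from the slides): a one-dimensional linear-Gaussian model stands in for the graphical model above, with a hidden x and an observed y. The variable names, the generative model, and the parameter values are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy model: hidden x ~ N(0, 1), observed y = a*x + Gaussian noise.
# Learning: estimate the parameter a from paired training examples.
# Inference: given a new y and the learned a, compute the posterior over x.
a_true, noise_std = 2.0, 0.5
x_train = rng.normal(size=1000)
y_train = a_true * x_train + noise_std * rng.normal(size=1000)

# Learning: least-squares (maximum-likelihood) estimate of a.
a_hat = (x_train @ y_train) / (x_train @ x_train)

# Inference: with prior x ~ N(0, 1) and likelihood y ~ N(a*x, noise_std^2),
# the posterior p(x | y) is Gaussian with this variance and mean.
y_new = 3.0
post_var = 1.0 / (1.0 + a_hat**2 / noise_std**2)
post_mean = post_var * a_hat * y_new / noise_std**2
print(f"learned a = {a_hat:.3f}; posterior on x given y = {y_new}: "
      f"mean {post_mean:.3f}, std {post_var**0.5:.3f}")

In a real vision problem the hidden variables would be, e.g., 3-d joint angles and the observations video frames, but the two steps (fit the model from examples, then condition on new observations) are the same.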
Cartoon history of speech recognition research
- 1960s, 1970s, 1980s: lots of different approaches ("hey, let's try this").
- 1980s: Hidden Markov Models (HMMs); the statistical approach took off.
- 1990s and beyond: HMMs are now the dominant approach. The person with the best training set wins.
Same story for document understanding
- The person with the best training set wins.
Computer vision is ready to make that transition
- Machine learning approaches are becoming dominant.
- We get to make and watch the transition to a principled, statistical approach happen.
- It's not trivial: issues of representation, robustness, generalization, speed, ...
Categories of the papers
- Learning image representations
- Learning manifolds
- Linear and bilinear models
- Learning low-level vision
- Graphical models, belief propagation
- Particle filters and tracking
- Face and object recognition
- Learning models of object appearance
1. Learning image representations
From http://www.amsci.org/amsci/articles/00articles/olshausencap1.html
From http://www.cns.nyu.edu/pub/eero/simoncelli01-reprint.pdf
2. Learning manifolds
Joshua B. Tenenbaum, Vin de Silva, and John C. Langford
From http://www.sciencemag.org/cgi/content/full/290/5500/2319
3. Linear and bilinear models
From http://www-psych.stanford.edu/jbt/NC120601.pdf
4. Learning low-level vision
5. Graphical models, belief propagation
From http://www.cs.berkeley.edu/yweiss/nips96.pdf
6. Particle filters and tracking
From http://www.robots.ox.ac.uk/ab/abstracts/eccv96.isard.html
7. Face and object recognition
From Viola and Jones, http://www.ai.mit.edu/people/viola/research/publications/ICCV01-Viola-Jones.ps.gz
From Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth.
8. Learning models of object appearance
(Figures: images containing the object, images not containing the object, and test images.)
From Weber, Welling, and Perona, http://www.gatsby.ucl.ac.uk/welling/papers/ECCV00_fin.ps.gz
Guest lecturers/discussants
- Andrew Blake (Condensation, Oxford/Microsoft)
- Baback Moghaddam (Bayesian face recognition, MERL)
- Paul Viola (Fast face recognition, MERL)
Class requirements
- Read each paper. Think about them. Discuss in class.
- Present one paper to the class.
- Present one computer example to the class.
- Final project: write a conference paper related to vision and learning.
1. Read the papers, discuss them
- Write down 3 insights about the paper that you might want to share with the class in discussion.
- Turn them in on a sheet of paper.
2. Presentations about a paper
- About 15 minutes long. Set the stage for discussions.
- Review the paper. Summarize its contributions. Give relevant background. Discuss how it relates to other papers we've read.
- Meet with me two days before to go over your presentation about the paper.
3. Programming example
- Present a computer implementation of a toy example that illustrates the main idea of the paper.
- Show trade-offs in parameter settings, or in training sets.
- Goal: help us build up intuition about these techniques.
- OK to use on-line code. Then focus on creating informative toy training sets.
Toy problems
- Simple summaries of the main idea.
- Identify an informative idea from the paper.
- Make a simple example using it.
- Play with it.
Toy problem
by Ted Adelson
Toy problem
"If you can make a system to solve this, I'll give you a PhD."
by Ted Adelson
Particle filter for inferring human motion in 3-d
Particle filter toy example
From Hedvig Sidenbladh's thesis, http://www.nada.kth.se/hedvig/publications/thesis.pdf
What we'll have at the end of the class
Code examples:
- Non-negative matrix factorization example
- 1-d particle filtering example (a minimal sketch follows below)
- Boosting for face recognition
- Example of belief propagation for scene understanding
- Manifold learning comparisons
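As an illustration of what these toy examples might look like, here is a minimal sketch of a 1-d particle filter (a bootstrap filter tracking a random-walk state observed in Gaussian noise). The dynamics, the observation model, and all parameter values are assumptions chosen for this sketch, not taken from any of the papers.

import numpy as np

rng = np.random.default_rng(1)

# 1-d particle filter: track a hidden state that drifts as a random walk
# and is observed through additive Gaussian noise.
n_steps, n_particles = 50, 500
proc_std, obs_std = 0.3, 1.0

# Simulate a ground-truth trajectory and noisy observations of it.
truth = np.cumsum(proc_std * rng.normal(size=n_steps))
obs = truth + obs_std * rng.normal(size=n_steps)

particles = rng.normal(scale=2.0, size=n_particles)  # initial guesses
estimates = []
for y in obs:
    # Predict: propagate each particle through the motion model.
    particles = particles + proc_std * rng.normal(size=n_particles)
    # Update: weight particles by the observation likelihood.
    weights = np.exp(-0.5 * ((y - particles) / obs_std) ** 2)
    weights /= weights.sum()
    estimates.append(weights @ particles)
    # Resample: draw a new particle set in proportion to the weights.
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[idx]

print("mean abs tracking error:", np.mean(np.abs(np.array(estimates) - truth)))

Playing with n_particles, proc_std, and obs_std is exactly the kind of parameter-setting trade-off the programming examples are meant to expose.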
4. Final project: write a conference paper
- When submitting papers to conferences you get just one shot, so it's important to learn how to make good submissions.
- We'll discuss many papers, and what's good and bad about them, during the class.
- I'll give a lecture on how to write a good conference paper.
- The subject of the paper can be:
  - A project from your own research.
  - A project you undertake for the class.
  - Your idea.
  - One I suggest to you.
Feedback options
- At the end of the course ("it would have been better if we had done this"): somewhat helpful.
- During the course ("I find this useful; I don't find that useful"): very helpful.
What background do you need?
- Be able to read and understand the papers.
- Linear algebra.
- Familiarity with estimation theory.
- Image filtering.
- Background in machine learning and computer vision.
Auditing versus credit
- If you're a student and want to take the class, sign up for credit.
  - You'll stay more engaged.
  - It makes it more probable that I can offer the class again.
- But if you do audit:
  - Please don't come to class if you haven't read the paper.
  - I may ask you to present to the class, anyway.
First paper
- Monday, Feb. 11.
- "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Olshausen BA, Field DJ (1996), Nature 381:607-609.
- Presenter: Bill Freeman
- Computational demonstration: need volunteer (software is available at http://redwood.ucdavis.edu/bruno/sparsenet.html). A rough sketch of the sparse-coding idea follows below.
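For whoever volunteers, here is a rough, self-contained sketch of the sparse-coding idea on synthetic data (it is not the SparseNet code linked above). It alternates sparse inference of the coefficients, done here with an L1 penalty and ISTA updates as a stand-in for the paper's sparse prior, with a gradient step on the dictionary. The dimensions, penalty weight, and optimization details are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: sparse coefficients pushed through a known random dictionary.
n_dims, n_atoms, n_samples, lam = 16, 32, 2000, 0.1
D_true = rng.normal(size=(n_dims, n_atoms))
D_true /= np.linalg.norm(D_true, axis=0)
codes = rng.normal(size=(n_atoms, n_samples)) * (rng.random((n_atoms, n_samples)) < 0.1)
X = D_true @ codes + 0.01 * rng.normal(size=(n_dims, n_samples))

def sparse_codes(D, X, n_iter=50):
    # ISTA: minimize 0.5*||X - D A||^2 + lam*||A||_1 over the codes A.
    A = np.zeros((D.shape[1], X.shape[1]))
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        A = A - step * D.T @ (D @ A - X)
        A = np.sign(A) * np.maximum(np.abs(A) - step * lam, 0.0)  # soft threshold
    return A

# Dictionary learning: alternate sparse coding with a gradient step on D.
D = rng.normal(size=(n_dims, n_atoms))
D /= np.linalg.norm(D, axis=0)
for it in range(30):
    A = sparse_codes(D, X)
    D -= 0.01 * (D @ A - X) @ A.T / n_samples  # reconstruction-error gradient
    D /= np.linalg.norm(D, axis=0)             # keep atoms at unit norm
    if it % 10 == 0:
        err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
        print(f"iter {it}: relative reconstruction error {err:.3f}")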
Second paper
- Wednesday, Feb. 13.
- "Learning the parts of objects by non-negative matrix factorization," D. D. Lee and H. S. Seung, Nature 401, 788-791 (1999), and commentary by Mel.
- Presenter: need volunteer
- Computational demonstration: need volunteer. A minimal NMF sketch follows below.
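As a starting point for that demonstration, here is a minimal sketch of non-negative matrix factorization on a synthetic matrix, using the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. The matrix sizes, rank, and iteration counts are arbitrary choices.

import numpy as np

rng = np.random.default_rng(3)

# NMF: factor a non-negative matrix X (features x samples) as X ~= W @ H
# with W, H >= 0, via multiplicative updates.
n_features, n_samples, rank = 20, 300, 5

# Synthetic non-negative data from a random low-rank model plus noise.
X = rng.random((n_features, rank)) @ rng.random((rank, n_samples))
X += 0.01 * rng.random(X.shape)

W = rng.random((n_features, rank))
H = rng.random((rank, n_samples))
eps = 1e-9  # avoids division by zero

for it in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)
    if it % 50 == 0:
        err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
        print(f"iter {it}: relative reconstruction error {err:.3f}")

On image data (columns of X as vectorized images), the columns of W play the role of the learned "parts."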