MIT 6.899: Learning and Inference in Vision
- Prof. Bill Freeman, wtf_at_mit.edu
- MW 2:30-4:00
- Room 34-301
- Course web page: http://www.ai.mit.edu/courses/6.899/
Reading class
- We'll cover about 1 paper each class.
- Seminal or topical research papers in the intersection of machine learning and vision.
- One student will present each paper. Then we'll discuss the paper as a class.
- One student will write a computer example illustrating the paper's main idea.
Learning and Inference
- Learning: learn the parameter values or structure of a probabilistic model. Look at many examples of people walking, and build up a probabilistic model relating video images to 3-d motions.
- Inference: infer hidden variables, given observations. E.g., given a particular video of someone walking, infer their motions in 3-d.
Learning and Inference
(Figure: a graphical model with observed variables, unobserved variables, and the statistical dependencies between them.)
- Learning: learn this model, and the form of the statistical dependencies.
Learning and Inference
(Figure: the same graphical model, with unobserved variables x1, x2 and observed variables y1, y2.)
- Learning: learn this model, and the form of the statistical dependencies.
- Inference: given this model and the observations y1, y2, infer x1, x2, or their conditional distribution (a minimal numerical sketch follows below).
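To make the learning/inference split concrete, here is a minimal numerical sketch (not from the slides): a one-dimensional linear-Gaussian model stands in for the graphical model above, with a hidden x and an observed y. The variable names, the generative model, and the parameter values are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy model: hidden x ~ N(0, 1), observed y = a*x + Gaussian noise.
# Learning: estimate the parameter a from paired training examples.
# Inference: given a new y and the learned a, compute the posterior over x.
a_true, noise_std = 2.0, 0.5
x_train = rng.normal(size=1000)
y_train = a_true * x_train + noise_std * rng.normal(size=1000)

# Learning: least-squares (maximum-likelihood) estimate of a.
a_hat = (x_train @ y_train) / (x_train @ x_train)

# Inference: with prior x ~ N(0, 1) and likelihood y ~ N(a*x, noise_std^2),
# the posterior p(x | y) is Gaussian with this variance and mean.
y_new = 3.0
post_var = 1.0 / (1.0 + a_hat**2 / noise_std**2)
post_mean = post_var * a_hat * y_new / noise_std**2
print(f"learned a = {a_hat:.3f}; posterior on x given y = {y_new}: "
      f"mean {post_mean:.3f}, std {post_var**0.5:.3f}")

In a real vision problem the hidden variables would be, e.g., 3-d joint angles and the observations video frames, but the two steps (fit the model from examples, then condition on new observations) are the same.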
Cartoon history of speech recognition research
- 1960s, 1970s, 1980s: lots of different approaches ("hey, let's try this").
- 1980s: Hidden Markov Models (HMMs); the statistical approach took off.
- 1990s and beyond: HMMs are now the dominant approach. The person with the best training set wins.
Same story for document understanding
- The person with the best training set wins.
Computer vision is ready to make that transition
- Machine learning approaches are becoming dominant.
- We get to make and watch the transition to a principled, statistical approach happen.
- It's not trivial: issues of representation, robustness, generalization, speed, ...
Categories of the papers
- Learning image representations
- Learning manifolds
- Linear and bilinear models
- Learning low-level vision
- Graphical models, belief propagation
- Particle filters and tracking
- Face and object recognition
- Learning models of object appearance
1. Learning image representations
From http://www.amsci.org/amsci/articles/00articles/olshausencap1.html
From http://www.cns.nyu.edu/pub/eero/simoncelli01-reprint.pdf
2. Learning manifolds
Joshua B. Tenenbaum, Vin de Silva, and John C. Langford
From http://www.sciencemag.org/cgi/content/full/290/5500/2319
3. Linear and bilinear models
From http://www-psych.stanford.edu/jbt/NC120601.pdf
4. Learning low-level vision
5. Graphical models, belief propagation
From http://www.cs.berkeley.edu/yweiss/nips96.pdf
6. Particle filters and tracking
From http://www.robots.ox.ac.uk/ab/abstracts/eccv96.isard.html
7. Face and object recognition
From Viola and Jones, http://www.ai.mit.edu/people/viola/research/publications/ICCV01-Viola-Jones.ps.gz
From Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth.
8. Learning models of object appearance
(Figures: images containing the object, images not containing the object, and test images.)
From Weber, Welling, and Perona, http://www.gatsby.ucl.ac.uk/welling/papers/ECCV00_fin.ps.gz
Guest lecturers/discussants
- Andrew Blake (Condensation, Oxford/Microsoft)
- Baback Moghaddam (Bayesian face recognition, MERL)
- Paul Viola (Fast face recognition, MERL)
Class requirements
- Read each paper. Think about them. Discuss in class.
- Present one paper to the class.
- Present one computer example to the class.
- Final project: write a conference paper related to vision and learning.
1. Read the papers, discuss them
- Write down 3 insights about the paper that you might want to share with the class in discussion.
- Turn them in on a sheet of paper.
2. Presentations about a paper
- About 15 minutes long. Set the stage for discussions.
- Review the paper. Summarize its contributions. Give relevant background. Discuss how it relates to other papers we've read.
- Meet with me two days before to go over your presentation about the paper.
3. Programming example
- Present a computer implementation of a toy example that illustrates the main idea of the paper.
- Show trade-offs in parameter settings, or in training sets.
- Goal: help us build up intuition about these techniques.
- OK to use on-line code. Then focus on creating informative toy training sets.
Toy problems
- Simple summaries of the main idea.
- Identify an informative idea from the paper.
- Make a simple example using it.
- Play with it.
Toy problem
by Ted Adelson
Toy problem
"If you can make a system to solve this, I'll give you a PhD."
by Ted Adelson
Particle filter for inferring human motion in 3-d
Particle filter toy example
From Hedvig Sidenbladh's thesis, http://www.nada.kth.se/hedvig/publications/thesis.pdf
What we'll have at the end of the class
Code examples:
- Non-negative matrix factorization example
- 1-d particle filtering example (a minimal sketch follows below)
- Boosting for face recognition
- Example of belief propagation for scene understanding
- Manifold learning comparisons
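As an illustration of what these toy examples might look like, here is a minimal sketch of a 1-d particle filter (a bootstrap filter tracking a random-walk state observed in Gaussian noise). The dynamics, the observation model, and all parameter values are assumptions chosen for this sketch, not taken from any of the papers.

import numpy as np

rng = np.random.default_rng(1)

# 1-d particle filter: track a hidden state that drifts as a random walk
# and is observed through additive Gaussian noise.
n_steps, n_particles = 50, 500
proc_std, obs_std = 0.3, 1.0

# Simulate a ground-truth trajectory and noisy observations of it.
truth = np.cumsum(proc_std * rng.normal(size=n_steps))
obs = truth + obs_std * rng.normal(size=n_steps)

particles = rng.normal(scale=2.0, size=n_particles)  # initial guesses
estimates = []
for y in obs:
    # Predict: propagate each particle through the motion model.
    particles = particles + proc_std * rng.normal(size=n_particles)
    # Update: weight particles by the observation likelihood.
    weights = np.exp(-0.5 * ((y - particles) / obs_std) ** 2)
    weights /= weights.sum()
    estimates.append(weights @ particles)
    # Resample: draw a new particle set in proportion to the weights.
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[idx]

print("mean abs tracking error:", np.mean(np.abs(np.array(estimates) - truth)))

Playing with n_particles, proc_std, and obs_std is exactly the kind of parameter-setting trade-off the programming examples are meant to expose.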
4. Final project: write a conference paper
- When submitting papers to conferences you get just one shot, so it's important to learn how to make good submissions.
- We'll discuss many papers, and what's good and bad about them, during the class.
- I'll give a lecture on how to write a good conference paper.
- The subject of the paper can be:
  - A project from your own research.
  - A project you undertake for the class.
  - Your idea.
  - One I suggest to you.
Feedback options
- At the end of the course ("it would have been better if we had done this"): somewhat helpful.
- During the course ("I find this useful; I don't find that useful"): very helpful.
What background do you need?
- Be able to read and understand the papers.
- Linear algebra.
- Familiarity with estimation theory.
- Image filtering.
- Background in machine learning and computer vision.
Auditing versus credit
- If you're a student and want to take the class, sign up for credit.
  - You'll stay more engaged.
  - It makes it more probable that I can offer the class again.
- But if you do audit:
  - Please don't come to class if you haven't read the paper.
  - I may ask you to present to the class, anyway.
First paper
- Monday, Feb. 11.
- "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Olshausen BA, Field DJ (1996), Nature 381:607-609.
- Presenter: Bill Freeman
- Computational demonstration: need volunteer (software is available at http://redwood.ucdavis.edu/bruno/sparsenet.html). A rough sketch of the sparse-coding idea follows below.
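For whoever volunteers, here is a rough, self-contained sketch of the sparse-coding idea on synthetic data (it is not the SparseNet code linked above). It alternates sparse inference of the coefficients, done here with an L1 penalty and ISTA updates as a stand-in for the paper's sparse prior, with a gradient step on the dictionary. The dimensions, penalty weight, and optimization details are arbitrary choices.

import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: sparse coefficients pushed through a known random dictionary.
n_dims, n_atoms, n_samples, lam = 16, 32, 2000, 0.1
D_true = rng.normal(size=(n_dims, n_atoms))
D_true /= np.linalg.norm(D_true, axis=0)
codes = rng.normal(size=(n_atoms, n_samples)) * (rng.random((n_atoms, n_samples)) < 0.1)
X = D_true @ codes + 0.01 * rng.normal(size=(n_dims, n_samples))

def sparse_codes(D, X, n_iter=50):
    # ISTA: minimize 0.5*||X - D A||^2 + lam*||A||_1 over the codes A.
    A = np.zeros((D.shape[1], X.shape[1]))
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        A = A - step * D.T @ (D @ A - X)
        A = np.sign(A) * np.maximum(np.abs(A) - step * lam, 0.0)  # soft threshold
    return A

# Dictionary learning: alternate sparse coding with a gradient step on D.
D = rng.normal(size=(n_dims, n_atoms))
D /= np.linalg.norm(D, axis=0)
for it in range(30):
    A = sparse_codes(D, X)
    D -= 0.01 * (D @ A - X) @ A.T / n_samples  # reconstruction-error gradient
    D /= np.linalg.norm(D, axis=0)             # keep atoms at unit norm
    if it % 10 == 0:
        err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
        print(f"iter {it}: relative reconstruction error {err:.3f}")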
Second paper
- Wednesday, Feb. 13.
- "Learning the parts of objects by non-negative matrix factorization," D. D. Lee and H. S. Seung, Nature 401, 788-791 (1999), and commentary by Mel.
- Presenter: need volunteer
- Computational demonstration: need volunteer. A minimal NMF sketch follows below.
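As a starting point for that demonstration, here is a minimal sketch of non-negative matrix factorization on a synthetic matrix, using the standard Lee-Seung multiplicative updates for the Frobenius-norm objective. The matrix sizes, rank, and iteration counts are arbitrary choices.

import numpy as np

rng = np.random.default_rng(3)

# NMF: factor a non-negative matrix X (features x samples) as X ~= W @ H
# with W, H >= 0, via multiplicative updates.
n_features, n_samples, rank = 20, 300, 5

# Synthetic non-negative data from a random low-rank model plus noise.
X = rng.random((n_features, rank)) @ rng.random((rank, n_samples))
X += 0.01 * rng.random(X.shape)

W = rng.random((n_features, rank))
H = rng.random((rank, n_samples))
eps = 1e-9  # avoids division by zero

for it in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)
    if it % 50 == 0:
        err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
        print(f"iter {it}: relative reconstruction error {err:.3f}")

On image data (columns of X as vectorized images), the columns of W play the role of the learned "parts."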