Title: 16-721: Learning-based Methods in Vision
1 16-721 Learning-based Methods in Vision
- Staff
- Instructor: Alexei (Alyosha) Efros (efros_at_cs), 4207 NSH
- TA: Jean-Francois Lalonde (jlalonde_at_cs), A521 NSH
- Web Page
- http://www.cs.cmu.edu/efros/courses/LBMV07/
2 Today
- Introduction
- Why This Course?
- Administrative stuff
- Overview of the course
- Image Datasets
- Projects / Challenges
3 A bit about me
- Alexei (Alyosha) Efros
- Relatively new faculty (RI/CSD)
- Ph.D. 2003, from UC Berkeley (signed by Arnie!)
- Research Fellow, University of Oxford, 03-04
- Teaching
- I am still learning
- The plan is to have fun and learn cool things, both you and me!
- Social warning: I don't see well
- Research
- Vision, Graphics, Data-driven stuff
4 PhD Thesis on Texture and Action Synthesis
Smart Erase button in Microsoft Digital Image Pro
Antonio Criminisi's son cannot walk but he can fly?
5 Why this class?
- The Old Days
- 1. Graduate Computer Vision
- 2. Advanced Machine Perception
6 Why this class?
- The New and Improved Days
- 1. Graduate Computer Vision
- 2. Advanced Machine Perception
- Physics-based Methods in Vision
- Geometry-based Methods in Vision
- Learning-based Methods in Vision
7 The Hip Trendy Learning
Describing Visual Scenes using Transformed
Dirichlet Processes. E. Sudderth, A. Torralba,
W. Freeman, and A. Willsky. NIPS, Dec. 2005.
8 Learning as Last Resort
9 Learning as Last Resort
- EXAMPLE
- Recovering 3D geometry from a single 2D projection
- Infinite number of possible solutions!
from Sinha and Adelson, 1993
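The ambiguity in the example above can be illustrated with a minimal sketch. It assumes an idealized pinhole camera at the origin looking along the z axis; the function names `project` and `points_on_ray` and the sample point are hypothetical and chosen only for illustration, not taken from the slides or from Sinha and Adelson. Every 3D point on the same viewing ray lands on the same 2D image point, so the image alone cannot say which one was the real scene point.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def points_on_ray(point, scales):
    """Scale a 3D point along its viewing ray through the camera center."""
    x, y, z = point
    return [(s * x, s * y, s * z) for s in scales]


# One illustrative scene point and several other points on the same ray.
base_point = (2.0, 1.0, 4.0)
candidates = points_on_ray(base_point, [0.5, 1.0, 2.0, 10.0])

# All candidates produce the same image coordinates.
projections = [project(p) for p in candidates]
for p, uv in zip(candidates, projections):
    print(p, "->", uv)
```

Any positive scale factor works here, which is the sense in which the number of possible 3D interpretations is infinite; extra assumptions or prior data are needed to pick one.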
10 Learning-based Methods in Vision
- This class is about trying to solve problems that do not have a solution!
- Don't tell your mathematician friends!
- This will be done using Data
- E.g. what happened before is likely to happen again
- Google Intelligence (GI): The AI for the post-modern world!
- Why is this even useful?
- Even a decade ago at ICCV'99 Faugeras claimed it wasn't!
11 The Vision Story Begins
- "What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking."
- -- David Marr, Vision (1982)
12 Vision: a split personality
- "What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is."
- Answer 1: pixel of brightness 243 at position (124,54)
- and depth .7 meters
- Answer 2: looks like bottom edge of whiteboard showing at the top of the image
- Which do we want?
- Is the difference just a matter of scale?
13 Measurement vs. Perception
14 Brightness: Measurement vs. Perception
15 Brightness: Measurement vs. Perception
Proof!
16 Lengths: Measurement vs. Perception
Müller-Lyer Illusion
http://www.michaelbach.de/ot/sze_muelue/index.html
17 Vision as Measurement Device
Real-time stereo on Mars
Physics-based Vision
Virtualized Reality
Structure from Motion
18 But why do Learning for Vision?
- What if I don't care about this wishy-washy human perception stuff? I just want to make my robot go!
- Small Reason
- For measurement, other sensors are often better (in DARPA Grand Challenge, vision was barely used!)
- For navigation, you still need to learn!
- Big Reason
- The goals of computer vision (what and where) are in terms of what humans care about.
19 So what do humans care about?
slide by Fei-Fei, Fergus, and Torralba
20 Verification: is that a bus?
slide by Fei-Fei, Fergus, and Torralba
21 Detection: are there cars?
slide by Fei-Fei, Fergus, and Torralba
22 Identification: is that a picture of Mao?
slide by Fei-Fei, Fergus, and Torralba
23 Object categorization
sky
building
flag
face
banner
wall
street lamp
bus
bus
cars
slide by Fei-Fei, Fergus, and Torralba
24 Scene and context categorization
slide by Fei-Fei, Fergus, and Torralba
25 Rough 3D layout, depth ordering
26 Challenges 1: viewpoint variation
Michelangelo 1475-1564
27 Challenges 2: illumination
slide credit: S. Ullman
28 Challenges 3: occlusion
Magritte, 1957
29 Challenges 4: scale
slide by Fei-Fei, Fergus, and Torralba
30 Challenges 5: deformation
Xu, Beihong 1943
31 Challenges 6: background clutter
Klimt, 1913
32 Challenges 7: object intra-class variation
slide by Fei-Fei, Fergus, and Torralba
33 Challenges 8: local ambiguity
slide by Fei-Fei, Fergus, and Torralba
34 Challenges 9: the world behind the image
35 In this course, we will
Take a few baby steps
36 Goals
- Read some interesting papers together
- Learn something new: both you and me!
- Get up to speed on a big chunk of vision research
- understand 70% of CVPR papers!
- Use learning-based vision in your own work
- Try your hand at a large vision project
- Learn how to speak
- Learn how to think critically about papers
37 Course Organization
- Requirements
- Paper Presentations (50%)
- Paper Presenter
- Paper Evaluator
- Class Participation (20%)
- Keep annotated bibliography
- Ask questions / debate / fight / be involved!
- Final Project (30%)
- Do something with lots of data (at least 500 images)
- Groups of 1 or 2
38 Paper Advocate
- Pick a paper from list
- That you like and are willing to defend
- Sometimes I will make you do two papers, or background
- Meet with me before starting, to talk about how to present the paper(s)
- Prepare a good, conference-quality presentation (20-45 min, depending on difficulty of material)
- Meet with me again 2 days before class to go over the presentation
- Office hours at end of each class
- Present and defend the paper in front of class
39 Paper Evaluator
- For some papers, we will have Evaluators
- Sign up for a paper you find interesting
- Get the code online (or implement if easy)
- Run it on a toy problem, play with parameters
- Run it on a new dataset
- Prepare a short 10-15 min presentation detailing results
- Discuss the paper critically
40 Class Participation
- Keep an annotated bibliography of papers you read (always a good idea!). The format is up to you. At a minimum, it needs to have:
- Summary of key points
- A few interesting insights, aha moments, keen observations, etc.
- Weaknesses of approach. Unanswered questions. Areas of further investigation, improvement.
- Submit your thoughts for current paper(s) at the end of each class (printout)
41 Class Participation
- Be active in class. Voice your ideas, concerns.
- You need to participate
- JF will be watching and keeping track!
42 Final Project
- Can grow out of paper presentation, or your own research
- But it needs to use large amounts of data!
- 1-2 people per project.
- Project proposals in a few weeks.
- Project presentations at the end of semester.
- Results presented as a CVPR-format paper.
- Hopefully, a few papers may be submitted to
conferences.
43 End of Semester Awards
- We will vote for
- Best Paper Presenter
- Best Paper Evaluator
- Best Project
- Prize: dinner in a nice restaurant
44 Course Outline
45 Datasets