Title: Outline
1. Outline
- Syllabus
- Introduction
- Computational Paradigms for Vision
- Appearance-based computer vision
- Physics-based computer vision
2. Class Materials
- In this class, most of the time we will discuss papers from the literature
- At the beginning I will give a general introduction based on chapters from different books
- There is no required textbook for this class
3. Vision
- Vision: the process of acquiring knowledge about environmental objects and events by extracting information from the light they emit or reflect
- Vision is a very complicated process, involving different processes such as memory
- Vision is the most useful source of information: about 50% of the human brain is devoted to visual processing
4. Vision cont.
- Vision has been studied from many different perspectives
- Computational vision: emphasis on approaches that are biologically plausible
- Computer vision: emphasis on algorithms to solve particular problems
- Statistical vision: emphasis on developing and analyzing mathematical and statistical models
5. Darwin X
Source: New Scientist
6. Computer Vision
- Computer vision tries to automate the vision process by building devices that simulate the human vision process
- Note that devices that solve only part of the problem can be very useful
7. Motivation Examples
- Computer vision techniques can provide novel opportunities and improve the performance of existing systems (sometimes significantly)
- Hopefully the following examples will convince you
8. Human-Computer Interfaces
- Mouse gestures
- Allow one to control programs more easily by drawing commands with the mouse
- Some of the 80 gestures recognized by StrokeIt (http://www.tcbmi.com/strokeit/)
9. Mouse Gestures
- In Photoshop, for example, you can
- In a web browser, you can
10. Human-Computer Interactions
11. 3D Hand Mouse
12. HandiEye
13. Sign Language Recognition
14. ALVINN
15. (No Transcript)
16. RALPH
17. Applications continued
18. DARPA Grand Challenge
- http://www.darpa.mil/grandchallenge/gcorg/index.html
19. DARPA Grand Challenge
20. Introduction cont.
- Honda ASIMO
- http://world.honda.com/ASIMO/
21. Automated Map Updating
22. Automated Map Updating
23. 3D Urban Models
24. Image-Guided Neurosurgery
25. Intracardiac Surgical Planning
26. Medical Image Analysis
27. Detection and Recognition
28. Detection and Recognition of Text in Natural Scenes
29. Detection and Recognition of Text in Natural Scenes
30. Text Detection and Recognition in Images and Videos
31. Driver Monitoring System
32. Face Recognition
http://www.a4vision.com
33. Intelligent Transportation Systems
http://dfwtraffic.dot.state.tx.us/dal-cam-nf.asp
34. Handwritten Address Interpretation System
- HWAI - http://www.cedar.buffalo.edu/HWAI/
- The HWAI (Handwritten Address Interpretation) System was developed at the Center of Excellence for Document Analysis and Recognition (CEDAR) at the University at Buffalo, The State University of New York. It resulted from many years of research at CEDAR on the problems of Address Block Location, Handwritten Digit/Character/Word Recognition, Database Compression, Information Retrieval, Real-Time Image Processing, and Loosely-Coupled Multiprocessing.
- The following presentation is based on the demonstration pages at HWAI
35. Handwritten Address Interpretation System cont.
36. Handwritten Address Interpretation System cont.
- Step 2: Address Block Location
37. Handwritten Address Interpretation System cont.
- Step 3: Address Extraction
38. Handwritten Address Interpretation System cont.
39. Handwritten Address Interpretation System cont.
40. Handwritten Address Interpretation System cont.
41. Handwritten Address Interpretation System cont.
- Step 7: Recognition
- (a) State Abbreviation Recognition
42. Handwritten Address Interpretation System cont.
- Step 7: Recognition
- (b) ZIP Code Recognition
43. Handwritten Address Interpretation System cont.
- Step 7: Recognition
- (c) Street Number Recognition
44. Handwritten Address Interpretation System cont.
- Step 8: Street Name Recognition
45. Handwritten Address Interpretation System cont.
- Step 9: Delivery Point Codes
46. Handwritten Address Interpretation System cont.
47. Military Applications
48. Automated Global Monitoring
49. Approaches to Computer Vision
- Vision is a complicated computational process
- Try to simulate the human vision system
- Try to build mathematical formulations of the environment (to be perceived) and then perform inference
- Try to invent approximate but efficient shortcuts to the general vision problem
50. Neuroanatomy of the Brain
51. Visual Pathway
52. Visual Pathway Diagram
53. Eye-Camera Analogy
- The eye is much like a camera
- Both form an upside-down image by admitting light through a variable-sized opening and focusing it on a two-dimensional surface using a transparent lens (see the projection sketch below)
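To make the analogy concrete, here is a minimal pinhole-projection sketch (a deliberate simplification; the eye uses a lens rather than a pinhole, but the geometry of the inverted image is the same):

```latex
% A scene point (X, Y, Z) projects onto an image plane at distance f behind the aperture:
\[ x = -f\,\frac{X}{Z}, \qquad y = -f\,\frac{Y}{Z} \]
% The minus signs are the inversion: the image formed on the retina (or sensor) is upside down.
```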
54. Functions of Different Cells
55. Nobel Prize Winning Experiments
56. Nobel Prize Winning Experiments
57. Nobel Prize Winning Experiments cont.
58. Nobel Prize Winning Experiments cont.
59. Simple Cells in the Visual Cortex
60. Simple Cells
- rectangular shaped receptive fields
- segregated ON and OFF zones
- respond to a bright or dark bar
- represent a restricted region in the visual field
- respond best to a specific orientation
- non-optimally oriented stimuli will be ineffective in stimulating the neuron (a standard mathematical idealization of such a receptive field is sketched below)
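A common computational idealization of a simple-cell receptive field (an illustration added here, not stated on the slide) is an oriented Gabor function: a sinusoid at the preferred orientation windowed by a Gaussian, which yields elongated, segregated ON and OFF zones:

```latex
% Gabor receptive field with preferred orientation \theta, wavelength \lambda, size \sigma,
% aspect ratio \gamma (elongation), and phase \psi (placement of ON vs. OFF zones):
\[ g(x, y) = \exp\!\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right)
             \cos\!\left(\frac{2\pi x'}{\lambda} + \psi\right),
   \qquad x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta \]
```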
61. Complex Cells
- larger receptive field than simple cells
- orientation tuned
- ON and OFF zones are mixed in the receptive field
- respond well to a moving bar
- direction selective
62. Hyper-complex Cells
- receptive field is selective for the length of the stimulus
- similar to complex cell receptive fields (orientation and direction selective)
- selective for features of shape such as the length and width of the bar of light
63. Brain Imaging
64. Psychophysical Studies
- Determination of the relationship between the magnitude of a sensation and the magnitude of the stimulus that gave rise to that perceptual sensation
- By studying the perception of different stimuli, one can infer what happens in the visual system (classic examples of such relationships are sketched below)
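Two classic forms such stimulus-sensation relationships take (standard psychophysics, included purely as an illustration):

```latex
% Weber's law: the just-noticeable change in a stimulus is proportional to its magnitude
\[ \frac{\Delta I}{I} = \text{const.} \]
% Stevens' power law: perceived magnitude S grows as a power of the stimulus intensity I
\[ S = k\, I^{\,n} \]
```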
65. Contrast Sensitivity Function
66. Single Channel or Multiple Channels
67. Neural Spatial Frequency Channels
- Neural receptive fields are tuned to the spatial frequency of the stimulus
- There seems to be a range of neural spatial frequency channels, each tuned to a different spatial frequency
- A spatial frequency channel can be adapted
68. Vision as an Inverse Problem
- 2-D images are generated by projecting the 3-D world onto an image plane under certain lighting conditions and viewing angles
- The images are a function of the 3-D object surfaces and their surface properties
- Vision essentially needs to solve an inverse problem
- Roughly the inverse of computer graphics (a toy forward model is sketched below)
69. An Example
70. Physics-based Computer Vision
- This naturally leads to physics-based computer vision
- One needs to build computational models of the image formation process (computer graphics)
- One needs to build representations of objects
- Which include surface geometry and texture (color map)
- Vision is essentially an algorithm to recover the underlying three-dimensional model of a given image
- A widely accepted framework is Bayesian inference (sketched below)
71. Face Recognition based on a 3D Model
72. Face Recognition based on a 3D Model cont.
73. Face Recognition based on a 3D Model cont.
74. Face Recognition based on a 3D Model cont.
75. Appearance-based Computer Vision
- A different approach is to try to utilize the resulting 2-D images directly
- The images are treated as matrices
- One tries to make decisions based on the images without building explicit 3-D models
- Note that here computer vision is an application of pattern recognition algorithms (a minimal example follows)
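As a minimal illustration of this view (a toy example, not one of the systems discussed below): flatten each image into a vector of pixel values and classify a new image by its nearest labeled neighbor.

```python
import numpy as np

def nearest_neighbor_label(train_images, train_labels, query_image):
    """Toy appearance-based recognizer: 1-nearest-neighbor on raw pixel vectors.

    train_images: (N, H, W) array of labeled example images
    train_labels: length-N sequence of class labels
    query_image:  (H, W) image to classify
    """
    X = train_images.reshape(len(train_images), -1).astype(float)  # each image becomes a row vector
    q = query_image.reshape(-1).astype(float)
    distances = np.linalg.norm(X - q, axis=1)                      # Euclidean distance in pixel space
    return train_labels[int(np.argmin(distances))]
```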
76. Face detection using spectral histograms
- The problem is to detect faces in images
77. Face detection using spectral histograms cont.
Preprocessing
78. Face detection using spectral histograms cont.
79. Face detection using spectral histograms cont.
80. Face detection using spectral histograms cont.
81. Face detection using spectral histograms cont.
82. Rotation-invariant face detection
83. Face detection using spectral histograms cont.
84. Object Detection and Recognition
- Object detection and recognition problem
- Given a set of images, find regions in these images which contain instances of relevant objects
- Here the number of relevant objects is assumed to be large
- For example, the system should be able to handle 30,000 different kinds of objects, an estimate of the human capacity for basic-level visual categorization
- Goal
- Develop a system that achieves real-time detection and recognition for images of size 720 x 480
- At a frame rate of 30 frames per second (the NTSC standard video stream), which leaves a very tight per-frame budget (see the arithmetic below)
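A quick back-of-the-envelope calculation (my own arithmetic, just to quantify the constraint):

```python
width, height, fps = 720, 480, 30
pixels_per_second = width * height * fps   # 10,368,000 pixels per second
frame_budget_ms = 1000.0 / fps             # about 33.3 ms available per frame
ns_per_pixel = 1e9 / pixels_per_second     # roughly 96 ns per pixel before any per-window work
print(frame_budget_ms, ns_per_pixel)
```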
85. A Framework
86. Requirements
- To achieve real-time detection and recognition, we need two critical components
- A classifier that can reduce the average classification time effectively
- Features that can discriminate a large number of objects and can be computed using a few instructions
87. Lookup Table Decision Trees
- We use local spectral histogram features that are computed using histogram integral images (the integral-image trick is sketched below)
- We build a decision tree by clustering
- At each node, we reduce the dimension to a small number, i.e., no more than 5 for detection and recognition applications
- We can approximate the decision from any of the classifiers using a lookup table
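A minimal sketch of the integral-image idea applied to histograms (my own illustrative code; the actual feature definitions used in this work may differ): precompute one cumulative-sum image per histogram bin, so that the histogram over any rectangle costs only four lookups per bin.

```python
import numpy as np

def build_integral_histograms(label_image, num_bins):
    """Per-bin integral images: ih[b, y, x] = number of pixels with bin b in label_image[:y, :x]."""
    h, w = label_image.shape
    ih = np.zeros((num_bins, h + 1, w + 1), dtype=np.int64)
    for b in range(num_bins):
        ih[b, 1:, 1:] = np.cumsum(np.cumsum(label_image == b, axis=0), axis=1)
    return ih

def rect_histogram(ih, top, left, bottom, right):
    """Histogram of the window [top:bottom, left:right) using four lookups per bin."""
    return (ih[:, bottom, right] - ih[:, top, right]
            - ih[:, bottom, left] + ih[:, top, left])

# Usage sketch: label_image could hold quantized filter responses, one bin index per pixel.
labels = np.random.randint(0, 8, size=(480, 720))
ih = build_integral_histograms(labels, num_bins=8)
window_hist = rect_histogram(ih, 100, 200, 150, 260)   # histogram over a 50 x 60 window
```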
88. Local spectral histogram features
89. Comparison of LSH and Haar features
90. Lookup Table Decision Trees
- This requires clustering, and we just use some standard methods
91. An example path of a decision tree
92. Real-time detection and recognition cont.
93. Optimal Component Analysis
- Linear representations are widely used in appearance-based object recognition applications
- Simple to implement and analyze
- Efficient to compute
- Effective for many applications
94. Standard linear representations
- Principal Component Analysis
- Designed to minimize the reconstruction error on the training set (a PCA sketch is given below)
- Fisher Discriminant Analysis
- Designed to maximize the separation between the class means
- Independent Component Analysis
- Designed to maximize the statistical independence among coefficients along different directions
- A toy example
- Standard representations give the worst recognition performance
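A minimal PCA sketch in the appearance-based setting (illustrative code; here X holds the flattened training images as rows):

```python
import numpy as np

def pca_basis(X, d):
    """Return the mean and a d-dimensional PCA basis (columns) for data X of shape (N, D)."""
    mean = X.mean(axis=0)
    Xc = X - mean                                     # center the training data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:d].T                                      # top-d principal directions, shape (D, d)
    return mean, U

def project(images, mean, U):
    """Low-dimensional coefficients: project flattened images onto the PCA subspace."""
    return (images.reshape(len(images), -1) - mean) @ U
```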
95. Optimal Component Analysis
- Derive a performance function that is related to the recognition performance
- Formulate the problem of finding optimal representations as an optimization problem on the Grassmann manifold (the formulation is sketched below)
- Use an MCMC stochastic gradient algorithm for optimization
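Schematically (my notation, not necessarily that of the original papers): with F(U) measuring the recognition performance obtained with an orthonormal basis U of a d-dimensional subspace of R^n, the problem is

```latex
% Optimal linear representation: maximize recognition performance over d-dimensional subspaces
\[ \hat{U} = \arg\max_{U \in \mathbb{R}^{n \times d},\; U^{\top}U = I_d} F(U) \]
% Only the span of U matters, so the effective search space is the Grassmann manifold
\[ \mathcal{G}_{n,d} = \{\, d\text{-dimensional linear subspaces of } \mathbb{R}^{n} \,\} \]
```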
96. Performance Measure - continued
- Suppose there are C classes to be recognized
- Each class has k_train training images
- Each class has k_cross cross-validation images (one common way to build F(U) from these is sketched below)
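One way to make F(U) concrete (a hedged sketch of the general idea; the exact definition used in the OCA papers may differ in its details): project all images onto U, classify each cross-validation image by its nearest training neighbor, and let F(U) be a smoothed version of the resulting recognition rate.

```latex
% Distance between images x and y in the subspace spanned by the orthonormal basis U
\[ d_U(x, y) = \lVert U^{\top}x - U^{\top}y \rVert \]
% For a cross-validation image x_{c,i} of class c, compare its nearest other-class and
% nearest same-class training images:
\[ \rho(c, i; U) =
   \frac{\min_{c' \neq c,\, j}\, d_U\big(x_{c,i},\, x^{\mathrm{train}}_{c',j}\big)}
        {\min_{j}\, d_U\big(x_{c,i},\, x^{\mathrm{train}}_{c,j}\big)} \]
% F(U): smoothed fraction of the C k_cross cross-validation images with \rho > 1,
% where h is a monotone, sigmoid-like function that keeps F differentiable
\[ F(U) = \frac{1}{C\, k_{\mathrm{cross}}} \sum_{c=1}^{C} \sum_{i=1}^{k_{\mathrm{cross}}}
          h\big(\rho(c, i; U) - 1\big) \]
```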
97. Performance Measure - continued
- F(U) depends on the span of U but is invariant to a change of basis
- In other words, F(U) = F(UO) for any orthonormal matrix O
- The search space of F(U) is therefore the set of all d-dimensional subspaces, which is known as the Grassmann manifold
- It is not a flat vector space, and the gradient flow must take the underlying geometry of the manifold into account
98. Kernel optimal component analysis
99. Kernel optimal component analysis
100. Kernel optimal component analysis
101. Kernel function and kernel parameter learning
102. Subset of a face dataset for visualization
103. Evolution of OCA learning
104. Performance comparison
105. Performance comparison on a full face dataset
106. Summary
- Computer vision as an information-processing process is very complex
- A fundamental approach to vision is analysis by synthesis
- Which involves building 3D models
- A popular shortcut is appearance-based computer vision
- Where an object is approximated by its views under different conditions
- We will start with the appearance-based approach