Title: Lecture: Image and Audio Analysis
1. Lecture: Image and Audio Analysis
IS 246: Multimedia Information
Michael Smith, UC Berkeley School of Information
Thursday 3:30 pm - 6:30 pm, Spring 2007
http://courses.ischool.berkeley.edu/i246/s07/
2. Updated Schedule
- Jan - Feb: Film Art, Media Theory and Signals Introduction
- March 1: Film Art Summary, Intro to Content Analysis, Assignment 1 Due
- March 8: Computer Vision and Signal Processing in Multimedia
- March 15: Media Theory Revisited, New Media, Intro to Video Production
- March 22: Video and Audio Production Workshop, Assignment 2
- April 5: Audio Information Retrieval and Media Asset Management
- April 12: Information Retrieval Continued, Video Production Assignment Due
- April 19: Media Communications, Intro to Social Media
- April 26: Future Media: Social Media, Media Aesthetics
- May 3: Future Media: Immersive Systems, 3D Modeling, Synthetics
3. Media Workflow
- Capture and Pre-Production
  - Classical and advanced media technology
  - Previsualization
- Editing and Post-Production
  - Film editing and new visualization technology
- Processing
  - Media content analysis
- Distribution / Sharing
  - Telecommunications and Social Media
4. Audio and Image Signals
5. Sampling
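The slide's waveform figures are not reproduced here. As a loose illustration of the idea (the signal frequency, sample rate, and use of NumPy are assumptions for illustration, not taken from the slide), the sketch below samples a continuous 5 Hz sinusoid at discrete instants t = n/fs:

```python
import numpy as np

f0 = 5.0        # continuous-tone frequency in Hz (assumed for illustration)
fs = 40.0       # sample rate in Hz; must exceed 2*f0 to avoid aliasing
duration = 1.0  # length of the sampled segment in seconds

n = np.arange(int(fs * duration))    # sample indices 0, 1, ..., fs*duration - 1
t = n / fs                           # sampling instants in seconds
x = np.sin(2 * np.pi * f0 * t)       # discrete samples x[n] = sin(2*pi*f0*n/fs)

print(len(x), "samples; first five:", np.round(x[:5], 3))
```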
6. Quantization
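Only the slide title survives here. A minimal sketch of uniform quantization (the bit depths and the NumPy implementation are assumptions for illustration) follows:

```python
import numpy as np

def quantize_uniform(x, bits):
    """Map samples in [-1, 1] onto 2**bits uniformly spaced levels."""
    step = 2.0 / (2 ** bits)               # quantization step size
    q = np.round(x / step) * step          # snap each sample to the nearest level
    return np.clip(q, -1.0, 1.0 - step)

x = np.sin(2 * np.pi * 5 * np.arange(40) / 40.0)   # a sampled sine for testing
for bits in (3, 8):
    err = np.max(np.abs(x - quantize_uniform(x, bits)))
    print(f"{bits}-bit quantization: max error {err:.4f}")
```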
7. Image Filtering
8. Region Detection
- Figure 3.10, Multimodal Video Characterization and Summarization (Smith 2004)
9. Texture Functions and Edge Detection
- Figure 3.11b, Multimodal Video Characterization and Summarization (Smith 2004)
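The figure itself is not reproduced. As a rough sketch of edge detection in general (Sobel gradients via SciPy; not necessarily the texture/edge functions used in the book), one could do:

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
image = rng.random((64, 64))            # stand-in for a grayscale frame

gx = ndimage.sobel(image, axis=1)       # horizontal intensity gradient
gy = ndimage.sobel(image, axis=0)       # vertical intensity gradient
magnitude = np.hypot(gx, gy)            # edge strength at each pixel

# Keep only strong edges with a simple global threshold.
edges = magnitude > magnitude.mean() + 2 * magnitude.std()
print("edge pixels:", int(edges.sum()), "of", image.size)
```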
10. MPEG-1 Motion Vectors
- B- and P-frame data
- Dependent on the encoder and the GOP (group of pictures) pattern
11. Pattern Matching for Motion
- Match a template in Image 1 to a location in Image 2
- Example with the L1 norm (sum of absolute differences) over a difference window Wt(i,j); a sketch follows below
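A minimal sketch of the exhaustive search described above, assuming small NumPy arrays stand in for the two frames; here the L1 cost is the sum of absolute differences over the template-sized window Wt(i,j):

```python
import numpy as np

def match_template_l1(frame, template):
    """Find the window of `frame` with the smallest L1 (sum of absolute
    differences) distance to `template`; returns (row, col) and the cost."""
    th, tw = template.shape
    fh, fw = frame.shape
    best_cost, best_pos = np.inf, (0, 0)
    for i in range(fh - th + 1):
        for j in range(fw - tw + 1):
            cost = np.abs(frame[i:i + th, j:j + tw] - template).sum()
            if cost < best_cost:
                best_cost, best_pos = cost, (i, j)
    return best_pos, best_cost

# Toy example: cut a template out of frame 1 and find it in a shifted frame 2.
rng = np.random.default_rng(1)
frame1 = rng.random((32, 32))
template = frame1[10:18, 12:20]
frame2 = np.roll(frame1, shift=(2, -3), axis=(0, 1))     # simulated motion
(pos_r, pos_c), cost = match_template_l1(frame2, template)
print("match at", (pos_r, pos_c), "-> motion vector", (pos_r - 10, pos_c - 12))
```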
12. LP and HP Filtering
13. Systems to Filters
- A system transforms a signal into a new signal or a different signal representation:
  y(t) = F( x(t) )
- Examples: y(t) = 2x(t), y(t) = x(t)^2, y(t) = x(t-2)
- Typical filter (a sketch of evaluating it follows below):
  y(t) = A1·y(t-1) + A2·y(t-2) + B0·x(t) + B1·x(t-1)
http://www.coe.montana.edu/ee/rmaher/ee477_SP04/lecture/EE477_SP04_01_intro_files/frame.htm
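As a rough sketch of evaluating that difference equation sample by sample (the coefficient values below are arbitrary choices for illustration, not from the slide):

```python
import numpy as np

def filter_samples(x, A1, A2, B0, B1):
    """Evaluate y(t) = A1*y(t-1) + A2*y(t-2) + B0*x(t) + B1*x(t-1)
    with zero initial conditions."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        y[t] = (A1 * (y[t - 1] if t >= 1 else 0.0)
                + A2 * (y[t - 2] if t >= 2 else 0.0)
                + B0 * x[t]
                + B1 * (x[t - 1] if t >= 1 else 0.0))
    return y

# A noisy step input run through a gentle low-pass-like recursion.
rng = np.random.default_rng(2)
x = np.concatenate([np.zeros(20), np.ones(20)]) + 0.1 * rng.standard_normal(40)
y = filter_samples(x, A1=0.5, A2=0.0, B0=0.25, B1=0.25)
print("last five outputs:", np.round(y[-5:], 3))
```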
14. Image Filtering
- Low Frequency (2x2 averaging kernel):
  ¼ ¼
  ¼ ¼
- High Frequency (3x3 sharpening kernel):
  -1 -1 -1
  -1  9 -1
  -1 -1 -1
- Differential (3x3 kernel):
   0  1  0
  -½  1  ½
   0  1  0
- 5x5 kernel:
   0 -1 -1 -1  0
  -1  2 -4  2 -1
  -1 -4 13 -4 -1
  -1  2 -4  2 -1
   0 -1 -1 -1  0
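A minimal sketch of applying the low-pass and high-pass kernels above by 2-D convolution; the use of SciPy and the random test image are assumptions, since the slide does not show an implementation:

```python
import numpy as np
from scipy.signal import convolve2d

low_pass = np.full((2, 2), 0.25)                    # 2x2 averaging kernel
high_pass = np.array([[-1, -1, -1],
                      [-1,  9, -1],
                      [-1, -1, -1]], dtype=float)   # 3x3 sharpening kernel

rng = np.random.default_rng(3)
image = rng.random((64, 64))                        # stand-in grayscale image

smoothed = convolve2d(image, low_pass, mode="same", boundary="symm")
sharpened = convolve2d(image, high_pass, mode="same", boundary="symm")
print("std  in: %.3f  low-pass: %.3f  high-pass: %.3f"
      % (image.std(), smoothed.std(), sharpened.std()))
```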
15. Image Filtering
16. Exercise
- Image panels on the slide: Differential Filter, Original, HPF
- Differential kernel:
   0  1  0
  -½  1  ½
   0  1  0
- HPF kernel:
  -1 -1 -1
  -1  9 -1
  -1 -1 -1
17. Histograms and Smoothing
- Figure 3.7, Multimodal Video Characterization and Summarization (Smith 2004)
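The figure is not reproduced. As a loose illustration (not the book's exact procedure), a gray-level histogram can be computed and then smoothed with a short moving average:

```python
import numpy as np

rng = np.random.default_rng(4)
image = (rng.random((64, 64)) * 255).astype(np.uint8)      # stand-in frame

hist, _ = np.histogram(image, bins=256, range=(0, 256))    # gray-level histogram

kernel = np.ones(5) / 5.0                                  # 5-tap moving average
smoothed = np.convolve(hist, kernel, mode="same")
print("raw peak bin:", int(hist.argmax()),
      "| smoothed peak bin:", int(smoothed.argmax()))
```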
18. Readings for Today
- The course website contains the updated class readings for March 8th:
  http://courses.ischool.berkeley.edu/i246/s07/readings.html
- Below are the discussion question assignments.
- Thursday 03/08: Content Analysis
  - Handouts from Chapter 3, Multimodal Video Characterization and Summarization (All).
  - Davenport, G., Aguierre-Smith, T.G. and Pincever, N. Cinematic Primitives for Multimedia. IEEE Computer Graphics and Applications, 11 (4), pp. 67-74. (Matt Schutte)
  - Dorai, C. and Venkatesh, S. Computational Media Aesthetics: Finding Meaning Beautiful. IEEE Multimedia, 8 (4), pp. 10-12. (Patrick Schmitz)
  - Davis, M. Media Streams: An Iconic Visual Language for Video Representation. In Baecker, R.M., Grudin, J., Buxton, W.A.S. and Greenberg, S. eds. Readings in Human-Computer Interaction: Toward the Year 2000, Morgan Kaufmann Publishers, Inc., San Francisco, 1995, pp. 854-866. (Kevin Lim and Zaven Demerjian)
  - A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain. Content-Based Image Retrieval at the End of the Early Years. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, 2000, pp. 1349-1380. (Patrick Schmitz)
  - P. Aigrain, H. Zhang, and D. Petkovic. Content-based Representation and Retrieval of Visual Media: A State-of-the-Art Review. Multimedia Tools and Applications, vol. 3, 1996, pp. 178-202. (Matt Ochmanek)
  - Fractals Presentation (Bernt Wahl).
19. Davis: Media Streams
20. Smeulders: CBIR
21. Evolution Robotics
- SIFT Features
- David Lowe's SIFT keypoint detector (UBC, 2003)
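As a rough sketch of extracting SIFT keypoints, assuming a recent opencv-python build that exposes cv2.SIFT_create; this is only the generic detector, not Evolution Robotics' own implementation:

```python
import cv2
import numpy as np

# Stand-in grayscale image; in practice this would be a frame loaded with
# e.g. cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).
rng = np.random.default_rng(5)
image = (rng.random((240, 320)) * 255).astype(np.uint8)

sift = cv2.SIFT_create()                           # SIFT detector/descriptor
keypoints, descriptors = sift.detectAndCompute(image, None)
print(len(keypoints), "keypoints;",
      "descriptor shape:", None if descriptors is None else descriptors.shape)
```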
22. Assignment for Next Week
- Edit Yourself
- Post 2 pictures with an edit effect on the class wiki (or email to msmith_at_ischool.berkeley.edu)
- Rotoscope, Edge Filter, Blur, Morph, etc.
23. Readings for March 15
- Computational Media Theory
  - Eco, U. Articulations of the Cinematic Code. In Nichols, B. ed., University of California Press, Berkeley, 1976, pp. 590-607. (Zaven)
  - Metz, C. Film Language: A Semiotics of the Cinema. University of Chicago Press, Chicago, 1991, pp. 92-107, pp. 108-146. (PS, Matt O.)
  - Eisenstein, S.M. Film Form: Essays in Film Theory. Harcourt Brace Jovanovich, Publishers, San Diego, 1949, pp. 45-63. (Matt S.)
  - Recommended: Davis, M. and Levitt, D. Time-Based Media Processing System (US Patent 6,243,087), Interval Research Corporation, USA, 2001, pp. 1-20. (Kevin)
- Continued from Feb. 22
  - Bordwell, D. and Thompson, K. Film Art: An Introduction. 8th edition, pp. 218-263 (Continued).
  - Kuleshov, L. Kuleshov on Film: Writings by Lev Kuleshov. University of California Press, Berkeley, 1974, pp. 41-55.
  - Isenhour, J.P. The Effects of Context and Order in Film Editing. AV Communications Review, 23 (1), pp. 69-80.
  - Burch, N. Theory of Film Practice. Princeton University Press, Princeton, N.J., 1981, pp. 3-16.
  - Barthes, R. Action Sequences. In Strelka, J. ed., Patterns of Literary Style. State University of Pennsylvania Press, University Park, Pennsylvania, 1971, pp. 5-14.
24. Computer Vision Videos
- Virtualized Reality
- Object Tracking
25. Previous MMM Projects
- Monkeybotster
- Recreation Evaluation Interface
- Lunchmeisters
- Recipe Box
- Wishter
- DARE
- HouseBuddies
26. Recreation Evaluation Interface
- Recreation Evaluation Interface enables camera phone users to see what's happening where they are not, and to let others in their community know what's happening where they are.
- Using real-time data and a reference database, this community can connect with other people with similar interests to maximize their local knowledge, enabling better decision-making about how and where to spend one's time.
27. DARE
- DARE will leverage camera phones as tools that allow users to play games with and against each other.
- The main objective of DARE is to bring different social networks together informally, or as ice breakers and team builders for corporations, school groups and other such groups.
28. MMM2: Find The Mobile News
- Find the Mobile News is a location-based system that aggregates photographs of news and other events based on geography. It allows camera-phone users to upload photos directly from their phones to the system immediately, as news happens and they capture it. Metadata with each photograph describes the location, date, time, and photographer of the picture; optionally, users can add free-text annotations as well.
- Users in the community or around the world can access photos on FTMN by any combination of geographic location, date, time, photographer, photographer rating, photograph rating, and keywords. They can view the results of a search on a map that indicates where each photo was taken. Aggregation of pictures in real time allows community-produced news to be published instantly and on location.