Title: DARPA Compare
2Multimedia
Michael Christel, Alex Hauptmann, Rong Jin (TA)
http://www.cs.cmu.edu/alex/mmCourse
3How to get in touch with us
- Mike Christel
- christel_at_cs.cmu.edu
- http://www.cs.cmu.edu/christel
- (412)268-7799 or x8-7799
- WeH5212
- Alex Hauptmann
- alex_at_cs.cmu.edu
- http://www.cs.cmu.edu/alex
- (412)268-1448 or x8-1448
- WeH5124
- Office Hours by Appointment
4Teaching Assistant
- Rong Jin
- jin_at_andrew.cmu.edu
- Office: WeH5316
- Office hours by appointment
- (412)268-4050 or x8-4050
5Course Outline, Part 1 of 3
- More details at www.cs.cmu.edu/alex/mmCourse
- October 22: Intro to Multimedia
- October 25: Multimedia Enabling Technologies, Macromedia Flash Intro and Demo
- October 29: Sound Processing, Speech Recognition
- November 1: Digital Video Creation and Transmission
- November 5: Speech Synthesis
6Course Outline, Part 2 of 3
- More details at www.cs.cmu.edu/alex/mmCourse
- November 8: Image Processing
- November 12: Digital Music and Music Processing
- November 15: Multimedia Internet Protocols, SMIL
- November 19: Synthetic Interviews: A Multimedia Company (Experiences from the Field)
- November 22: Programming for Interactive Multimedia (CGI Scripts/ASP)
7Course Outline, Part 3 of 3
- More details at www.cs.cmu.edu/alex/mmCourse
- November 29: Content Analysis and Coding of Digital Audio and Video, Multimedia Storage and Retrieval Management
- December 3: Video Retrieval Evaluation and Testing, Multimedia Interface Design, Digital Libraries
- December 6: Visual Design, Multimedia Interface Design Guidelines, Multimedia use in the future (Experience on Demand)
- December 10: Multimedia as Entertainment Technology, Virtual Reality
9Homeworks
- See http://www.cs.cmu.edu/alex/mmCourse
- 9 Homeworks planned, 10 points each
- One hard homework will be worth 20 points
- No final, no midterm
- Publish homeworks on your web page - email us URL
- Space?
10Today: Intro to Multimedia
Apple Knowledge Navigator Vision 1988
11Multimedia
Audio
Networking
Psychology
Natural Language Processing
Video
Storage Systems
Information Retrieval
Data Compression
Images
HCI
CPU Power
12Definition of Multimedia
- Multi (Latin multus - numerous)
- Media, medium (Latin medius, medium - middle, center, intermediary; Latin mediat - intermediary, means)
- Multiple types of information captured, stored, manipulated, transmitted, and presented.
- Specifically: Images, Video, Audio (Speech) and Text
13Definition of Multimodal
- Multi (Latin multus - numerous)
- Modal (Latin modus - manner)
- Traditionally refers to input/output formats
- Input
- sounds, speech (mike)
- gestures (camera, tablet)
- eye-gaze (camera),
- mouse,
- keyboard
- Output
- sounds, speech
- video
- Pictures
- Animations
- Text
14Perceived Information
- Physical Variables
- Sound is a waveform
- An image is a waveform
- light is electromagnetic radiation with different intensity in spatial coordinates
- color corresponds to wavelength
15History of Multimedia I
- Analog signals to sensors
- E.g. vinyl records
- Fidelity is faithfulness to the original
- Digital representation (60s)
- Sampling
- Quantizing
- Coding
- codec, modem, (A/D and D/A)
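The three steps above (sampling, quantizing, coding) can be sketched in a few lines. This is only an illustration, not the course's own code: the 1 kHz test tone, the 8 kHz sampling rate, the 3-bit word size, and all function names are assumptions chosen to keep the output small.

```python
import math

def sample(signal, n_samples, rate_hz):
    """Sampling: read the continuous signal at evenly spaced instants."""
    return [signal(i / rate_hz) for i in range(n_samples)]

def quantize(samples, bits):
    """Quantizing: map each value in [-1, 1] onto one of 2**bits integer levels."""
    levels = 2 ** bits
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))             # clip to the assumed input range
        code = int((x + 1.0) / 2.0 * levels)   # scale to 0..levels, then truncate
        codes.append(min(levels - 1, code))    # x == 1.0 would otherwise overflow
    return codes

def encode(codes, bits):
    """Coding: write each level as a fixed-width binary word."""
    return [format(c, "0{}b".format(bits)) for c in codes]

def tone(t):
    """Hypothetical stand-in for an analog input: a 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000 * t)

samples = sample(tone, 8, 8000)   # 8 samples at 8 kHz = 1 ms of signal
codes = quantize(samples, 3)      # 3 bits -> 8 levels
words = encode(codes, 3)
print(words)
```

With only 8 levels the coded values follow the sine wave very coarsely; raising `bits` narrows the gap between the original and its digital copy.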
16Hardware Advances
- CPU
- Bus
- Network I/O
- Keyboard, Mouse
- Disk
- Mike A/D Board
- Camera A/D Board
- Speakers (D/A Board)
- Display
17History of Multimedia II
- Analog controls only
- Special hardware (Displays, Scanners, FFTs)
- Integrated hardware components
- Further Integration
- Other devices
18History of Multimedia III
- Limiting Factors
- Storage Limits
- CPU Speeds
- I/O Speeds
- Network Bandwidth
19Why Digital?
- Universal storage, transmission format
- CD, internet
- Precision (Range of values, number of bits, floating point)
- Lossless transmission/storage
- BUT
- sampling rate distorts information
- size requirements may be large compared to
analog
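To put a number on the size requirement, here is a rough calculation for uncompressed audio. The parameters (44.1 kHz, 16 bits, stereo, as used on audio CDs) are picked for illustration and do not come from the slides; `pcm_bytes` is a hypothetical helper name.

```python
def pcm_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Storage needed for uncompressed PCM audio, in bytes."""
    total_bits = sample_rate_hz * bits_per_sample * channels * seconds
    return total_bits // 8

# Assumed example: one minute of CD-style audio
one_minute = pcm_bytes(44_100, 16, 2, 60)
print(one_minute)             # 10584000 bytes
print(one_minute / 1e6)       # about 10.6 MB per minute
```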
20Digitization Process
- Sampling from an analog signal
- Sampling Errors relate to signal frequencies
- Quantization Errors
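Both error types can be estimated with simple arithmetic. The sketch below assumes an ideal sampler and a uniform quantizer that rounds to the nearest level; the function names and the example numbers are illustrative only.

```python
def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a pure tone after sampling.

    Tones above half the sampling rate (the Nyquist limit) fold back
    into the range 0 .. f_sample_hz / 2.
    """
    f = f_signal_hz % f_sample_hz
    return min(f, f_sample_hz - f)

def max_quantization_error(full_range, bits):
    """Worst-case rounding error of a uniform quantizer: half of one step."""
    step = full_range / 2 ** bits
    return step / 2

print(alias_frequency(3000, 8000))      # 3000: below Nyquist, unchanged
print(alias_frequency(5000, 8000))      # 3000: above Nyquist, folded back
print(max_quantization_error(2.0, 8))   # 0.00390625 for a -1..1 range at 8 bits
```

A 5 kHz tone sampled at 8 kHz is indistinguishable from a 3 kHz tone, which is why the sampling error depends on the frequencies present in the signal.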
21Text
- ASCII, Unicode
- Formatted Text, Rich Text
- Document Formats
- Structured: TeX, HTML
- Page Descriptions: PostScript, PDF
22Graphics
- Objects
- circles, splines, rectangles, lines
- Editable
- resize, reshape, move, colorize
- Synthetic
23Images (Pictures)
- Fixed digitized representation
- bitmap, colors per pixel
- Editable in limited ways
- retouch, cut and paste, remap colors, filter (Photoshop tools)
- no model of the thing
- Captured
- not just from real life, clip art, screen dump
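Because a bitmap stores a fixed number of bits for every pixel, its uncompressed size follows directly from its dimensions and color depth. A small sketch, with image sizes chosen only as examples and `bitmap_bytes` as a hypothetical helper:

```python
def bitmap_bytes(width, height, bits_per_pixel):
    """Uncompressed size of a bitmap image, in bytes."""
    return width * height * bits_per_pixel // 8

print(bitmap_bytes(640, 480, 24))   # 921600 bytes with 24-bit color
print(bitmap_bytes(640, 480, 8))    # 307200 bytes with a 256-color palette
```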
24Audio
- Sounds
- hear 15 Hz to 20 kHz
- Speech is 50 Hz to 10 kHz
- Speech Recognition
- It is hard to wreck a nice beach
- Ice cream I scream
- Synthesis
- Speech
- Music
- MIDI for 128 instruments, 47 percussion sounds
- Notes, timing
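MIDI stores notes as numbers rather than sound. The sketch below shows one common way to turn a note number into a pitch and how a Note On message is laid out. Equal temperament with A4 = 440 Hz is an assumed tuning convention, and the helper names are hypothetical.

```python
def midi_to_hz(note):
    """Frequency of a MIDI note number, assuming equal temperament, A4 (69) = 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

def note_on(channel, note, velocity):
    """Three-byte Note On message: status byte 0x90 + channel, then note and velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    return bytes([0x90 | channel, note, velocity])

print(midi_to_hz(69))                 # 440.0
print(round(midi_to_hz(60), 2))       # 261.63 (middle C)
print(note_on(0, 60, 100).hex())      # 903c64
```

Timing is not part of the message itself; a sequencer or a MIDI file supplies when each message is sent.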
25Speech Recognition Issues
- Continuous vs Discrete
- Vocabulary Size
- Channel (Microphone)
- Environment (Location of mike and Speaker)
- Speaker Dependent/Speaker Independent
- Context (Language Model)
- Interactivity (Dialog Model)
26Speech Recognition Knowledge Sources
27Speech Variations
Style Variations: careful, clear, articulated, formal, casual, spontaneous, normal, read, dictated, intimate
Voice Quality: breathy, creaky, whispery, tense, lax, modal
Speaking Rate: normal, slow, fast, very fast
Context: sport, professional, interview, free conversation, man-machine dialogue
Stress: in noise, with increased vocal effort (Lombard reflex), emotional factors (e.g. angry), under cognitive load
28Video
- Frames comprise the video
- Frame rate sets the delay between successive frames
- minimal change between frames
- Sequencing creates the illusion of movement
- > 16 fps is smooth
- Standards: 29.97 fps is NTSC, 25 fps is PAL, 60 fps is HDTV
- Interlacing
- Display scan rate is different
- monitor refresh rate
- 60 - 70 Hz (= 1/s)
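The relation between frame rate, frame delay, and display refresh is simple arithmetic. A short sketch using the frame rates listed above; the 60 Hz monitor is an assumed example and the function names are hypothetical.

```python
def frame_interval_ms(fps):
    """Delay between successive frames, in milliseconds."""
    return 1000.0 / fps

def refreshes_per_frame(refresh_hz, fps):
    """How many display refreshes occur while one video frame is shown."""
    return refresh_hz / fps

print(frame_interval_ms(25))                 # 40.0 ms for PAL
print(round(frame_interval_ms(29.97), 2))    # 33.37 ms for NTSC
print(refreshes_per_frame(60, 25))           # 2.4 refreshes per PAL frame
```

The non-integer ratio in the last line is one reason the display scan rate has to be handled separately from the video frame rate.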
29Captured vs. Synthetic
- Animation vs Video
- Graphics vs Pictures
- Synthesizer vs Recording
- Storage? Manipulation? Processor Requirements?
- Fidelity to real world
- Hybrids are possible
30Why is Multimedia Important?
- Our society -
- captures its experience,
- records its accomplishments,
- portrays its past
- informs its masses
- in pictures, audio and video
- For many, CNN has become the publication of record
- Multimedia learning leverages multiple intelligences (Gardner, 1993)
- Multimedia digital libraries are an essential component of
- formal, informal, and professional learning
- distance education, telemedicine
31Technology Push vs Market Pull
- Home Entertainment
- Catalog Ordering
- Multimedia Training, Education
- Videoconferencing
- Professional Video Services
- Videomail
- Speech Recognition
32Hype vs. Reality
- What is feasible, under what circumstances?
- What is possible?
- What is impossible?
- What is unlikely?
33Multimedia Visions
- DARPA: Dominate the Battle Space
- HP: 1995
- LSI: Flash Point
- HP: Synergies
34Intro to Multimedia: That's all for today