Title: More Pixels and Samples: High Resolution Media Streaming
1More Pixels and Samples: High Resolution Media Streaming
Roger Zimmermann
Data Management Research Laboratory
University of Southern California
Los Angeles, CA 90089
http://dmrl.usc.edu
2Outline
- Motivation
- Background
- Remote Media Immersion
- Distributed Immersive Performance
- High-performance Data Recording Architecture
- Demonstration
- Conclusions
3Motivation
- The charter of the Integrated Media Systems Center (IMSC) is Immersipresence
- Immerse real (e.g., people) and virtual elements into a common space
- Becomes much more interesting in a distributed environment
- Many sub-problems: tracking, gesture recognition, data management, ...
- Video and audio are an important component
4What is the problem?
- Live streaming is either
- Low to medium quality, or
- Very expensive, i.e., there are only a few people to call
- Other obstacles
- Complicated (not like the telephone)
- Often requires room engineering
- Network bandwidth is not available
- Some of the technical constraints can and will be
solved
5Ex. Network Infrastructure
- UTOPIA (Utah Telecommunications Open Infrastructure Agency): public works project to provide fiber to the home (FTTH).
- SuperNet, Alberta, Canada. Public project to provide a high speed Internet infrastructure.
- NSF sponsored workshop, Oct. 23-24, 2003, Chicago, Illinois. The importance of broaderband networks is recognized.
6Research Timeline
(Timeline figure, 2002 to 2004. Events shown: Unveiling of RMI Demonstration; Internet2 Meeting: RMI Demonstration; DIP Experiment 1: Distributed Duet; Recording from Stream; DIP Experiment 2: Remote Master Class; DIP Experiment 3: Duet with Audience; APAN Meeting: HYDRA Experiment. Dates shown: Jun 2-3, Oct 29, Dec 28, Jan 18, Jan 19, Jun 2-3, Jan 29.)
7What is the RMI?
The goal of the Remote Media Immersion system is
to build a testbed for the creation of immersive
applications.
- Immersive application aspects
- Multi-modal environment (aural, visual, haptic, ...)
- Shared space with virtual and real elements
- High fidelity
- Geographically distributed
- Interactive
8RMI Challenges
- Immersive, high-quality video acquisition and rendering
- High Definition video: 1080i and 720p (40 Mb/s)
- Immersive, high-quality audio acquisition and rendering
- 10.2 channels of uncompressed audio (12 Mb/s)
- Storage and transmission of media streams across networks
- Synchronization between streams (A/V, A/A, V/V)!
9RMI Architecture
10RMI Experimental Setup
- Synchronized immersive audio and HDTV streamed playback from Yima server over Internet2
- 16 channels of immersive audio, uncompressed at 16 Mb/s
- 1920x1080i HDTV content, MPEG-2 compressed at 40 Mb/s
- Control of end-to-end process: capturing, network interface, transmission, rendering
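The stream rates quoted for this setup can be turned into an aggregate bandwidth and a storage estimate. The following is a minimal sketch using only the 16 Mb/s audio and 40 Mb/s video figures from the slide; the function names and the choice of 1 GB = 10^9 bytes are assumptions made for illustration.

```python
# Stream rates quoted on the slide, in megabits per second.
AUDIO_MBPS = 16.0      # 16 channels of uncompressed immersive audio
VIDEO_MBPS = 40.0      # 1920x1080i HDTV, MPEG-2 compressed
AUDIO_CHANNELS = 16


def per_channel_mbps(total_mbps: float, channels: int) -> float:
    """Average rate of a single audio channel."""
    return total_mbps / channels


def storage_gb_per_hour(rate_mbps: float) -> float:
    """Storage needed for one hour at a constant rate (assumes 1 GB = 10**9 bytes)."""
    bits = rate_mbps * 1e6 * 3600
    return bits / 8 / 1e9


total_mbps = AUDIO_MBPS + VIDEO_MBPS
print(f"aggregate rate: {total_mbps:.0f} Mb/s")
print(f"per audio channel: {per_channel_mbps(AUDIO_MBPS, AUDIO_CHANNELS):.1f} Mb/s")
print(f"one hour of both streams: {storage_gb_per_hour(total_mbps):.1f} GB")
```

The per-channel result matches the "1 Mb/s each" figure given later for the duet experiment.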
11Internet2 Fall 02 Member Meeting
Video: HDTV 1280x720p
Audio: 10.2 channel, immersive sound system
New World Symphony, Miami, FL
12Distributed Immersive Performance
- Outgrowth of Remote Media Immersion (RMI)
- Create seamless immersive environment for distributed musicians, conductor (active) and audience (passive)
- Compelling relevance for any human interaction scenario: education, journalism, communications
- Scenario
- Orchestra not available in town
- Famous soloist cannot fit travel into schedule
- Multiple soloists in different places
17Challenge: network latency
(Figure: network links labeled with latencies of 60 ms, 20 ms, 40 ms, 30 ms, 10 ms, and 30 ms)
18- Key observations
- Network latency maps to audio delay on stage
- Video delay is zero
- Challenge
- Synchronization
- Transmitting low latency video of conductor to players and audience
- Maintaining constant delay between players
(Figure: stage layout with Conductor, Player 1, and Player 2; distances labeled 15 m / 45 ms, 15 m / 45 ms, and 10 m / 30 ms)
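The distance and delay pairs on this slide (15 m / 45 ms, 10 m / 30 ms) correspond to roughly 3 ms of acoustic delay per meter. A minimal sketch of that mapping follows; the speed of sound value (343 m/s, dry air at about 20 °C) and the function names are assumptions, so the results come out slightly below the rounded slide figures.

```python
# Assumption: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def acoustic_delay_ms(distance_m: float) -> float:
    """Time for sound to travel distance_m on a physical stage, in ms."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def equivalent_distance_m(latency_ms: float) -> float:
    """Stage distance that would produce the same audio delay as a network latency."""
    return latency_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S


for distance in (10.0, 15.0):
    print(f"{distance:.0f} m on stage -> {acoustic_delay_ms(distance):.1f} ms")

for latency in (10.0, 30.0, 60.0):
    print(f"{latency:.0f} ms network latency -> {equivalent_distance_m(latency):.1f} m apart")
```

Read this way, a network latency budget is a statement about how far apart the players would effectively be standing.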
19Barriers and Requirements
- 1. Real-time continuous media (CM) stream transmission (network protocol) with low latency
- 2. Precise timing: GPS clock, synchronization
- 3. Data loss management: error concealment, FEC, retransmission, multi-path streaming
- 4. Many-to-many transmission capability
- 5. Low latency, high-quality real-time video and audio acquisition and rendering
- 6. Real-time CM stream recording
- 7. User experiments, requirements, specifications, performance evaluation
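Item 3 lists forward error correction (FEC) among the data loss techniques. The slides do not say which FEC scheme was used; as an illustration only, the sketch below shows the simplest form, a single XOR parity packet per group, which can repair one lost packet without a retransmission round trip. All names are hypothetical, and packets in a group are assumed to have equal length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Parity packet: XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild a group in which at most one packet (marked None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity FEC can repair only one loss per group")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with all surviving packets yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(group)
print(recover([group[0], None, group[2]], parity))
```

The cost is one extra packet per group; larger groups lower the bandwidth overhead but raise the delay before a loss can be repaired.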
20Distributed Immersive Performance v.1.0-The Duet
- Experiments and Objectives
- Experimental testbed and demonstration system
- Demonstrate and document a distributed musical performance with two musicians (a duet)
- Two-way interactive video and 10.2 channel immersive audio capability
- Explore other applications involving passive and active participants, such as two-site interactive meetings
- Evaluate technical barriers and psychophysical effects of latency and fidelity on music and other forms of human interaction between two interconnected sites
- Dennis Thurmond - USC Thornton School of Music
- Elaine Chew - USC Industrial and Systems Engineering
21Distributed Immersive Performance v.1.0-The Duet
(Figure: Linux PCs and DV FireWire cameras at Ramo Hall of Music (RHM 106) and Powell Hall (PHE 106), linked by the 100BaseT campus net and the 100BaseT IMSC net; distance labeled 350 meters)
- Video: NTSC resolution, 31 Mb/s DV, software decode, one-way latency 110 ms due to DV camera compression + < 5 ms network
- Audio: uncompressed, 16 or more channels at 1 Mb/s each, one-way latency < 10 ms due to audio processing + < 5 ms network
22Distributed Immersive Performance v.1.0-The Duet
23HYDRA Streaming Architecture
- Most previous work in streaming media has focused on the retrieval and playback functionality.
- More and more devices directly output digital media streams
- E.g., camcorders (FireWire, USB, SDI), microphones (Bluetooth), mobile handsets (3G)
- Need for a backend data stream recording/playback system (Super TiVo)
- HYDRA (High-performance Data Recording Architecture), ICEIS 2003
24Challenges
- Variable bit rate media streams
- Multi-zoned disks
- Different read and write transfer rates
25Live Streaming
- Latency is a crucial limiting factor
- Only 20-40 ms is unnoticeable (for universal interactive applications)
- Tradeoff: Latency versus bandwidth
- Compression reduces bandwidth
- But high compression increases latency (e.g., interframe MPEG compression)
- Approach
- Perform experiments within this design space
- e.g., DV: NTSC resolution, 31 Mb/s, SW/HW codecs
- e.g., uncompressed audio and video
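The trade-off can be made concrete with the figures quoted earlier for the v.1.0 duet setup. The sketch below compares one-way latency against the 40 ms upper end of the unnoticeable range; the class and function names are hypothetical, and the "< 10 ms" and "< 5 ms" slide values are treated as worst-case bounds.

```python
from dataclasses import dataclass

# Upper end of the 20-40 ms range the slide calls unnoticeable.
UNNOTICEABLE_LIMIT_MS = 40.0


@dataclass
class StreamConfig:
    name: str
    bandwidth_mbps: float
    processing_ms: float   # acquisition / compression latency
    network_ms: float      # network transit latency

    @property
    def one_way_ms(self) -> float:
        return self.processing_ms + self.network_ms


def within_budget(cfg: StreamConfig, limit_ms: float = UNNOTICEABLE_LIMIT_MS) -> bool:
    """True if the one-way latency stays inside the unnoticeable range."""
    return cfg.one_way_ms <= limit_ms


configs = [
    StreamConfig("DV video, NTSC", 31.0, 110.0, 5.0),
    StreamConfig("uncompressed audio, 16 channels", 16.0, 10.0, 5.0),
]

for cfg in configs:
    verdict = "within" if within_budget(cfg) else "over"
    print(f"{cfg.name}: {cfg.bandwidth_mbps:.0f} Mb/s, "
          f"{cfg.one_way_ms:.0f} ms one-way ({verdict} budget)")
```

Under these numbers the audio path fits the budget while the DV video path does not, which is the kind of result that motivates exploring other points in the design space.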
26Architecture: HYDRA HD Live Streaming
(Figure: streaming pipeline with components labeled JVC HD10U, FireWire, MPEG TS Extractor, RTP/UDP/IP, MPEG-2 Decoder, HD-SDI, VGA, and Display)
- Acquisition and rendering PC are both Linux based (RH 9 includes kernel support for FireWire).
- MPEG transport stream extraction.
- Data transport via UDP packets with single retransmissions
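The slide does not describe how the single-retransmission policy is implemented. One plausible reading, sketched below without real sockets, is that the receiver detects gaps in packet sequence numbers and asks for each missing packet exactly once; if that one retransmission is also lost, the packet is given up rather than delaying playback further. The class and method names are hypothetical, and sequence number wraparound is ignored.

```python
from typing import Dict, List, Set


class SingleRetransmitReceiver:
    """Receiver-side bookkeeping: each missing packet is requested once only."""

    def __init__(self) -> None:
        self.next_expected = 0
        self.received: Dict[int, bytes] = {}
        self.requested: Set[int] = set()

    def on_packet(self, seq: int, payload: bytes) -> List[int]:
        """Store a packet and return the sequence numbers to request again."""
        self.received[seq] = payload
        nacks: List[int] = []
        # A jump past next_expected means the packets in between went missing.
        # Each gap is seen once, so each packet is requested at most once.
        for missing in range(self.next_expected, seq):
            self.requested.add(missing)
            nacks.append(missing)
        self.next_expected = max(self.next_expected, seq + 1)
        return nacks

    def unrecovered(self) -> List[int]:
        """Packets that were requested but never arrived."""
        return sorted(self.requested - self.received.keys())


rx = SingleRetransmitReceiver()
print(rx.on_packet(0, b"a"))   # in order, nothing to request
print(rx.on_packet(3, b"d"))   # gap: packets 1 and 2 are requested
print(rx.on_packet(1, b"b"))   # the retransmission of 1 arrives
print(rx.unrecovered())        # 2 was requested once and is still missing
```

Bounding recovery to one round trip keeps the added latency predictable, which matters more for live streams than perfect delivery.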
27Rendering
- Solution 1: Software based rendering
- Use X11 hw acceleration: XvMC (libmpeg2)
- Motion compensation and iDCT with GPU
- Our hw: NVIDIA FX 5200 ($100)
- Performance: 90 fps @ 1280x720 with 3 GHz P4
28Rendering
- Issues with software rendering
- Precise timing: 29.97 fps
- Decoding time for I, P, and B frames varies
- Buffering of decoded frames necessary to achieve precise timing
- Transport stream splitter and audio decoding
- Video card refresh rate (timing) is independent of MPEG timing, but
- Non-standard display modes are possible: 720p on Linux (16x9)
- Decoding latency
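The link between varying decode times and buffering can be sketched numerically. The example below assumes frames are decoded back to back and displayed at exactly 30000/1001 fps (the NTSC rate usually written as 29.97), and computes the smallest start-up delay that lets every frame be ready by its display time. The per-frame decode times are made-up illustrative values, not measurements from the system described here.

```python
from fractions import Fraction
from itertools import accumulate
from typing import Sequence

# Exact frame period for 30000/1001 fps (about 33.37 ms).
FRAME_PERIOD_S = Fraction(1001, 30000)


def min_startup_delay(decode_times_s: Sequence[Fraction]) -> Fraction:
    """Smallest display start time such that no frame misses its deadline.

    Frame i finishes decoding at the running sum of decode times and is
    shown at start + i * FRAME_PERIOD_S.
    """
    if not decode_times_s:
        return Fraction(0)
    finish_times = accumulate(decode_times_s)
    return max(done - i * FRAME_PERIOD_S for i, done in enumerate(finish_times))


def ms(value: int) -> Fraction:
    return Fraction(value, 1000)


# Hypothetical decode times for an I, B, B, P frame sequence.
decode_times = [ms(40), ms(15), ms(15), ms(25)]
delay = min_startup_delay(decode_times)
print(f"start-up delay: {float(delay) * 1000:.1f} ms")
```

Frames decoded before their display time have to wait in a buffer, so the start-up delay computed here is also a lower bound on the decoding latency the buffer adds.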
29Rendering
- Solution 2: Hardware based rendering
- E.g., CineCast HD board from Vela Research
- Digital HD-SDI and analog RGB/YPrPb outputs
- Great and stable picture (but ...)
- Genlock input for synchronization
30Rendering
- Issues with hardware rendering
- Linux drivers hard to come by
- CineCast HD board uses SCSI interface
- Wrote our own SCSI extensions to the Linux SCSI Generic driver (/dev/sg0)
- Decoding latency: requires 8 x 64 kB to start decoding
- Consumer HD card: Telemann HiPix ($400). But: no Linux drivers (no Windows filters?)
- New Vela card: CineCast HD LE
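The 8 x 64 kB start-up requirement translates directly into latency, because the decoder cannot begin until that much of the stream has arrived. A rough worked example follows; it assumes 1 kB = 1024 bytes and a constant stream rate, and uses the 40 Mb/s HD rate quoted earlier purely as an illustrative input. The function name is hypothetical.

```python
def startup_fill_ms(buffers: int, buffer_kb: int, stream_mbps: float) -> float:
    """Time to receive buffers * buffer_kb of data at a constant stream rate.

    Assumes 1 kB = 1024 bytes and 1 Mb/s = 10**6 bits per second.
    """
    bits_needed = buffers * buffer_kb * 1024 * 8
    return bits_needed / (stream_mbps * 1e6) * 1000.0


for rate in (20.0, 40.0):
    print(f"{rate:.0f} Mb/s stream: {startup_fill_ms(8, 64, rate):.1f} ms before decoding starts")
```

Lower bit rates take longer to fill the same buffers, so stronger compression makes this part of the latency worse.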
31Live HD Video Streaming (1280x720p)
32Distributed Immersive Performance v.2.0-Extended Architecture
- Conflicting requirements: Low latency and low bandwidth (i.e., use of compression)
- Solution: two-tier architecture
- Between performers
- Low latency stereo audio streaming
- Low latency video streaming
- Between performers and audience
- High definition video streaming
- Multichannel audio streaming (10.2 channel)
- Recording of all streams synchronously for archival purposes and later playback.
33(Figure: Performer 1, Performer 2, Audience, and Playback and Recording, connected by streams labeled: multichannel audio; stereo audio; low latency, low resolution video; high latency, high resolution video)
34Thank You! Questions?
- More info at
- Data Management Research Lab
- http://dmrl.usc.edu
- Integrated Media Systems Center
- http://imsc.usc.edu
- Acknowledgments
- Kun Fu, Beomjoo Seo, Shihua Liu, Dwipal A. Desai,
Didi Shu-Yuen Yao, Mehrdad Jahangiri, Farnoush
Banaei-Kashani, Rishi Sinha, Hong Zhu, Nitin
Nahata, Sahitya Gupta, Vasan N. Sundar,