Title: Next Generation of System Architectures for Tele-Immersive Environments
1. Next Generation of System Architectures for Tele-Immersive Environments
- Klara Nahrstedt
- (klara_at_cs.uiuc.edu)
- University of Illinois at Urbana-Champaign
2. Outline
- Tele-immersive Environments
- Challenges
- Multi-camera Systems
- Multi-display Systems
- Internet2 Networking
- Architectural Design Space
- 3D Compression, 3D Streaming, Adaptation
- Case Studies: Comparison between TEEVE and Coliseum
- Conclusion
3. Tele-Immersive Environments (Application Model)
4. TI Goals
- Investigate tools for improving the effectiveness of remote collaboration using COTS components for a broader audience
- Investigate architectures for tele-immersive systems with 3D realism
- Develop infrastructures, algorithms, systems, and technologies that support
  - Ease of use
  - Quality
  - Perceptual cues
  - Scalability
  - 3D video and audio
5. Experimental Examples of Tele-Immersive Environments
TI in UIUC (Prof. Nahrstedt)
TI in UC Berkeley (Prof. Bajcsy)
Internet 2
6. Experimental Examples of Tele-Immersive Environments
TI in HP: Single-PC Coliseum System (Dr. Harlyn Baker, Dr. Nina Bhatti, et al.)
7. 3D Camera Challenges
- 3D Camera
  - RGB and depth yield 5 bytes per pixel
  - Frame size: 640x480 pixels
  - Calibration is very time-consuming
  - Real-time capturing and content creation (15 fps)
  - Each 3D camera needs one PC
  - Cameras need a FireWire bus to get high throughput
- 3D Multi-camera Environment
  - 10-20 3D cameras
  - Camera spacing issue: different spacing is needed for physical activity than for collaborative conferencing
  - 180-, 300-, or 360-degree camera coverage
8. 3D Content Challenges
- Huge TI frame size
  - Macro-frame (e.g., 10 3D frames in one time instance)
- TI frame rate
  - 15 macro-frames per second
- Huge bandwidth uncompressed
  - One 3D stream: 640x480 (pixels/frame) x 5 (bytes/pixel) x 15 (frames/sec) = 23,040,000 bytes/sec (approx. 23 MBytes/sec)
  - One TI session (10 3D streams): 230,400,000 bytes/sec (approx. 230 MBytes/sec)
- 3D compression
  - RGB can be delivered in a lossy manner
  - Depth must be delivered in a lossless manner
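The uncompressed-bandwidth figures above follow directly from the frame geometry. A quick sketch of the arithmetic (the slide only states 5 bytes per pixel total; any RGB/depth split is an assumption):

```python
# Uncompressed bandwidth of one 3D stream and one TI session,
# using the slide's example numbers.
WIDTH, HEIGHT = 640, 480       # pixels per frame
BYTES_PER_PIXEL = 5            # RGB + depth (the split is not specified)
FPS = 15                       # frames per second per stream
NUM_STREAMS = 10               # 3D streams per TI macro-frame

stream_bps = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS   # 23,040,000 B/s, approx. 23 MB/s
session_bps = NUM_STREAMS * stream_bps                # 230,400,000 B/s, approx. 230 MB/s
```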
9. Networking Challenges
- Multi-stream coordination: due to large bursts at routers, we can't send all streams at the same time; we need spacing
- Synchronization of multiple streams: we need to send all streams within a constrained time interval so that the rendered frame can be completed in a timely fashion
- Decision about transport protocol: UDP takes small packet sizes, hence large context-switch overhead; TCP performs positive-acknowledgement algorithms, hence incurs larger delays over the lossy Internet
- QoS issues
  - Keep end-to-end latency low
  - Need high throughput
  - Need low or no loss rate
  - Deal with power issues on 3D cameras in terms of thermal problems
  - Keep synchronization skews low
  - Deal with very heterogeneous networks (Gigabit LAN, Internet2, 100 Mbps LAN)
  - Deal with heterogeneous computing platforms (processors, OS, busses, ...)
  - Deal with scale (camera arrays, participants, microphone arrays, ...)
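The spacing requirement above can be illustrated with a toy scheduler that staggers stream transmissions evenly across one frame period; the even-offset policy and the function name are illustrative, not TEEVE's actual gateway logic:

```python
# Sketch: space N stream transmissions evenly across one frame period
# instead of sending them all at once (which bursts at routers).
# Even spacing is an assumed policy for illustration only.

def send_offsets(num_streams: int, frame_rate: float) -> list[float]:
    """Start time (seconds) of each stream's send within one frame period."""
    period = 1.0 / frame_rate
    return [i * period / num_streams for i in range(num_streams)]

# 10 streams at 15 fps: all sends fit inside the ~66.7 ms frame period,
# ~6.7 ms apart, so no two streams burst into the router simultaneously.
offsets = send_offsets(num_streams=10, frame_rate=15)
```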
10. Timing Challenges
11. Multi-streaming without Coordination
12. Rendering Challenges
- Rendering in real time
  - Consider between 5 and 15 3D streams, each with a frame rate of 10 fps, rendered into one 4D stream at 10 fps
  - This requires a rendering station that runs at 50-150 fps with 640x480 pixels per frame and 5 bytes per pixel
- Placement of rendering capabilities
  - Placement is possible at the sender side or at the receiver side
- Assistance in display management
  - Rendering may need to adapt based on the viewer model
- Feedback to the sender side with meaningful information
  - If rendering can't happen at the desired speed, which streams should be stopped at the sender side, and which information dropped?
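The 50-150 fps requirement above is simply stream count times per-stream frame rate; a quick check of the arithmetic:

```python
# Rendering-station load: the renderer must consume every incoming 3D frame,
# so its frame-scheduling rate is (number of streams) x (per-stream fps).
WIDTH, HEIGHT, BYTES_PER_PIXEL = 640, 480, 5

def render_rate(num_streams: int, fps: int) -> int:
    return num_streams * fps

low = render_rate(5, 10)     # 50 fps at the low end
high = render_rate(15, 10)   # 150 fps at the high end
# At the high end the renderer ingests ~230 MB of pixel data per second.
pixel_bytes_per_sec = high * WIDTH * HEIGHT * BYTES_PER_PIXEL
```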
13. Display Challenges
- Selection of scene views
- Automatic computation of scene views
- Visual information management
- Selection of screen layouts
- Mapping of camera views to display locations if a multi-view display is desired
- Mapping of user feedback onto view/layout selection
14. Next Generation of System Architectures: Design Choices
- Methodology
  - We will consider two case studies, their system architectures, and their approaches to the challenges
    - TEEVE system (UIUC/UCB) and Coliseum system (HP/U. Colorado)
  - Other tele-immersive systems
    - U. North Carolina at Chapel Hill (H. Fuchs, Ketan Mayer-Patel, et al.): VR TI environment
    - Microsoft Research (Yong Rui et al.): video conferencing for a virtual classroom
- Design choices
  - Architectural and functional choices
  - 3D camera choices
  - Data model choices
  - Compression choices
  - Streaming protocol choices
  - Rendering choices
  - Display choices
15. TEEVE Architecture
16. Coliseum Architecture (Complete Dataflow in a Coliseum Session)
17. 3D Camera Choices
- TEEVE: cluster of four 2D cameras, where 3 cameras are BW and 1 camera is RGB
- Coliseum: five CCD cameras with CMOS imagers
18. Data Model
- TEEVE
  - 3D data model
  - Each camera cluster produces a depth image f_{i,j} containing depth and color information per pixel
  - Each 3D camera cluster creates a 3D video stream
  - At time T_j the capturing tier has N 3D reconstructed frames that constitute a macro-frame F_j = {f_{1,j}, f_{2,j}, ..., f_{N,j}}, showing the same scene from different viewing angles
  - TEEVE operates N 3D video streams
- Coliseum
  - 2D data model
  - Each camera produces color information
  - All five 2D streams get reconstructed and rendered into one 3D stream
  - The 3D stream includes RGB and alpha (depth) information
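The TEEVE macro-frame model above can be sketched as two small types; all names here are illustrative, not TEEVE's actual code, and the RGB/depth byte split is an assumption:

```python
# Illustrative types for TEEVE's data model: each camera cluster i produces a
# depth image f_ij (color + depth per pixel); the N frames captured at time
# T_j form the macro-frame F_j = {f_1j, ..., f_Nj}.
from dataclasses import dataclass

@dataclass
class DepthImage:              # f_ij
    stream_id: int             # i: which 3D camera cluster produced it
    timestamp: float           # T_j: capture instant
    rgb: bytes                 # color plane (e.g., 640*480*3 bytes)
    depth: bytes               # depth plane (e.g., 640*480*2 bytes)

@dataclass
class MacroFrame:              # F_j
    timestamp: float
    frames: list[DepthImage]   # one per stream; same scene, different angles

    def complete(self, n_streams: int) -> bool:
        """True once every one of the N streams has contributed its frame."""
        return len({f.stream_id for f in self.frames}) == n_streams

frame = MacroFrame(timestamp=0.0,
                   frames=[DepthImage(1, 0.0, b"", b""),
                           DepthImage(2, 0.0, b"", b"")])
```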
19. 3D Compression
- TEEVE
  - Lossy compression for RGB data
    - Scheme A: color reduction
    - Scheme B: Motion JPEG
  - Lossless (run-length coding) compression for the depth data
- Coliseum
  - Lossy compression for RGB data
    - MPEG-4
  - Lossless RLE compression for the alpha data
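Since the depth channel must survive losslessly, run-length coding is a natural fit: depth maps have long constant runs (e.g., background). Here is a minimal byte-level RLE sketch, an illustration of the technique rather than either system's actual codec:

```python
# Minimal lossless run-length coder: emit (count, value) byte pairs,
# with runs capped at 255 so the count fits in one byte.

def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(enc: bytes) -> bytes:
    out = bytearray()
    for k in range(0, len(enc), 2):
        out += bytes([enc[k + 1]]) * enc[k]     # repeat value, count times
    return bytes(out)

# A depth row that is mostly background compresses from 640 bytes to 8.
depth_row = bytes([0] * 500 + [17] * 3 + [0] * 137)
encoded = rle_encode(depth_row)
```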
20. 3D Video Streaming
- TEEVE
  - End-to-end multi-tier streaming protocols
    - Bandwidth estimation
    - Rate adaptation
  - TCP/IP per 3D video stream, to use the large packet size and avoid frequent context switching
  - Spacing and rate shaping at the gateway
  - Coordination among gateways, using a token scheme, to avoid packet drops at routers
- Coliseum
  - UDP for the 3D rendered video stream
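The gateway token scheme mentioned above can be pictured as a round-robin token ring in which only the token holder may transmit its burst, so gateway bursts never overlap at a shared router. The policy below is an assumed illustration, not TEEVE's actual protocol:

```python
# Illustration of token-based gateway coordination: gateways take turns
# transmitting. Round-robin token passing is an assumed policy.

class TokenRing:
    def __init__(self, gateways: list[str]):
        self.gateways = gateways
        self.holder = 0                    # index of the gateway with the token

    def may_send(self, gateway: str) -> bool:
        return self.gateways[self.holder] == gateway

    def pass_token(self) -> str:
        """Called after the holder finishes its burst; returns the new holder."""
        self.holder = (self.holder + 1) % len(self.gateways)
        return self.gateways[self.holder]

ring = TokenRing(["gw-sender-1", "gw-sender-2"])   # hypothetical gateway names
ring.pass_token()                                  # gw-sender-1 done; 2's turn
```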
21. Rendering Choices
- TEEVE
  - Rendering happens at the receiver side
    - Reason: allows for more inter- and intra-stream adaptation
  - Current maximum rendering is possible at 4 macro-frames per second with up to 12 3D camera clusters, creating 12 3D streams of 640x480 pixels, rendered together into a 4D video stream (i.e., the rendering processor must schedule up to 48 frames per second)
- Coliseum
  - Rendering happens at the sender side, i.e., the system renders all desired viewpoints from the single 3D IBVH model of the local user and ships them independently to the different participating sites
    - Reason: all streams are needed to reconstruct the 3D stream (depth); once reconstructed, it can be rendered together
  - Current rendering is possible at 15 frames per second with 5 CCD (FireWire, Point Grey) cameras, creating and rendering one 3D video stream of 640x480 pixels
22. Stream/Display Mapping Choices (TEEVE)
- Mapping of streams onto displays is based on the locality of cameras
- Selection of scene views
  - Default scene views are automatically computed using the geometry of the 3D camera locations
- Selection of screen layout
  - Three parameters determine how scene views rendered at view points are presented to users:
    - Number of view points
    - Number of displays
    - Which view point is the focus view point
23. Display Choice (Coliseum)
- Use of a single display, since it is a single-user video conferencing environment with 3D realism
- Single 3D view point of multiple users in a conferencing setup
- This was the first version; it later became all prettified (molded plastic, retractable arms, etc.)
24. Experimental Setup (TEEVE)
- Metrics
  - Overall throughput of macro-frame F_j
  - Completion time interval for macro-frame F_j
  - Individual throughput of stream i
  - End-to-end delay of a macro-frame F_j
- Experimental parameters of the remote testbed (UIUC-UCB)
  - Number of sender gateways: 2
  - Number of receiver gateways: 2
  - Number of 3D streams: 12
  - Frame rate: 4 fps
- Equipment
  - Dell Precision 450 (dual Xeon processor with 1 GByte of memory), running Fedora Core 2
  - 100 Mbps LAN and Internet2 between UCB and UIUC
25. Validation (TEEVE)
(Figures: macro-frame delay, overall throughput, completion interval)
26. Validation of Compression (TEEVE)
(Figures: compression time and visual quality for Scheme A and Scheme B, before and after compression)
27. Experimental Setup and Results (Coliseum)
- Multi-PC IBVH, 733 MHz P3s (2-person conference)
  - 8-10 Hz QVGA (224 x 224 displays / 7 computers)
- Single-PC IBVH, dual-processor 2 GHz P4
  - 17 Hz on VGA (300 x 300 displays)
- In tests of up to 10 users:
  - Observed latency: 250 ms
  - Network latency: 6 ms local, 25 ms to Boulder
  - Bandwidth (with MPEG): 620 Kbps per stream
  - Scales in a work-conserving way with participants
28. Validation (Coliseum)
- Overall latencies
  - Cameras: 20
  - Processing: 50
  - Network: <5
  - Display: 25
Coliseum is compute-intensive, so we can control end-node behavior and system performance.
29. Conclusion
- Tele-immersive environments are emerging, but to make them cost-effective and high-performance, we need new, jointly designed hardware and software architectures
- New 3D COTS cameras can be built, but they present serious challenges to vision, system, and network design
  - New 3D camera hardware needed
  - New vision software needed
  - New streaming protocols needed
  - New 3D compression algorithms needed
  - New multi-tier feedback schemes needed
  - New user view models needed
  - New timing and synchronization models needed
30. Acknowledgements
- The TEEVE work is supported by NSF funding and is joint work with
  - UIUC researchers: Zhenyu Yang, Bin Yu, Jin Liang, Yi Cui, Zahid Anwar, Robert Bocchino, Nadir Kiyanclar, Professor Roy Campbell, and William Yurcik
  - UCB researchers: Professor Ruzena Bajcsy and Dr. Sang-Hack Jung
- We would like to acknowledge the use of material from the ACM Multimedia 2003 paper about the Coliseum system, with further information from Dr. Nina Bhatti, Hewlett-Packard
- Publications
  - Z. Yang et al., "Real-time 3D Video Compression for Tele-Immersive Environments," SPIE/ACM Multimedia Computing and Networking (MMCN), January 2006, San Jose, CA
  - Z. Yang et al., "TEEVE: The Next Generation Architecture for Tele-immersive Environments," IEEE International Symposium on Multimedia (ISM), December 2005, Irvine, CA
  - H. Baker et al., "Computation and Performance Issues in Coliseum, an Immersive Videoconferencing System," ACM Multimedia 2003, Berkeley, CA
  - H. Baker et al., "Understanding Performance in Coliseum, an Immersive Videoconferencing System," ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 1, 2005