Title: Motivation
1 Motivation
- Video Communication over Heterogeneous Networks
- Diverse client devices
- Various network connection bandwidths
- Limitations of Scalable Video Coding Schemes
- Limited layers supported
- No video format changes
- Video Transcoding Provides Dynamic Solutions
- Channel bandwidth adaptation
- Video coding format adaptation
2 Challenges in Video Transcoding
- Improve Efficiency of Video Transcoding
- Large data volume
- High computational complexity
- Optimize Visual Quality for a Given Bit Rate
- Human visual system (HVS) based video transcoding is desirable
3 Proposed Solutions
- Exploit Foveation Property of the HVS in Video Transcoding
- Develop Fast Algorithms for Video Transcoding
- DCT-domain foveation filtering technique
- Fast algorithms for DCT-domain inverse motion compensation
- Local bandwidth constrained DCT-domain inverse motion compensation
- Look-up-table based DCT-domain inverse motion compensation
4 Foveation
- The Human Eye Samples the Visual Field Non-uniformly
- The highest sampling resolution is at the fovea
- The sampling resolution decreases rapidly away from the fovea
- Retinal Images are Inherently Non-uniform in Spatial Resolution
[Figure: cells per degree versus eccentricity (deg)]
5 Foveation Modelling
- Foveated Contrast Threshold [Geisler & Perry, 98]
- Foveated Cut-off Frequency fc
- Spatial Frequencies Beyond the Cut-off Frequency are Invisible (Foveated Image)
- e2: Half-resolution eccentricity
- CT0: Minimum contrast threshold
- CT: Contrast threshold
- f: Spatial frequency (cyc/degree)
- e: Retinal eccentricity (degree)
- α: Spatial frequency decay constant
[Figure: local cut-off frequency (cyc/deg) versus pixel position relative to the foveation point (pixels). Image size: 512 x 512; unit of v: image height.]
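The symbols listed on this slide match the Geisler-Perry contrast threshold model, usually written as CT(f, e) = CT0 * exp(α * f * (e + e2) / e2). Setting CT = 1 (the largest possible contrast) gives the cut-off frequency fc(e) = e2 * ln(1 / CT0) / (α * (e + e2)). The sketch below evaluates both. The constant values, the interpretation of v as the viewing distance in image heights, and all function names are assumptions made for illustration; none of them are stated on the slide.

```python
import math

# Assumed model constants (commonly quoted fits for this model;
# they are not given on the slide).
ALPHA = 0.106      # spatial frequency decay constant
E2 = 2.3           # half-resolution eccentricity (deg)
CT0 = 1.0 / 64.0   # minimum contrast threshold


def contrast_threshold(f, e, alpha=ALPHA, e2=E2, ct0=CT0):
    """CT(f, e) = CT0 * exp(alpha * f * (e + e2) / e2)."""
    return ct0 * math.exp(alpha * f * (e + e2) / e2)


def cutoff_frequency(e, alpha=ALPHA, e2=E2, ct0=CT0):
    """Frequency (cyc/deg) at which CT reaches 1 at eccentricity e (deg)."""
    return e2 * math.log(1.0 / ct0) / (alpha * (e + e2))


def eccentricity_deg(dist_px, image_height_px=512, v=3.0):
    """Eccentricity of a pixel dist_px away from the foveation point.

    Assumes v is the viewing distance measured in image heights.
    """
    return math.degrees(math.atan(dist_px / (v * image_height_px)))


def local_cutoff(dist_px, image_height_px=512, v=3.0):
    """Local cut-off frequency (cyc/deg) for a pixel offset in pixels."""
    return cutoff_frequency(eccentricity_deg(dist_px, image_height_px, v))


if __name__ == "__main__":
    for d in (0, 64, 128, 256):
        print(d, round(local_cutoff(d), 2))
```

Under these assumptions the cut-off frequency halves at e = e2, which is what "half-resolution eccentricity" refers to.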
6 Foveated Images
Foveation point is marked by X
7 Foveated Contrast Sensitivity Function (FCSF)
- Foveated Contrast Sensitivity Function (FCSF)
- Shape the Compression Distortion According to FCSF
[Figure: normalized contrast sensitivity of the human eye versus distance from the foveation point (pixels). Image size: 512 x 512; viewing distance: 3 times the image height.]
8 Video Transcoding Architecture
- Open-Loop Video Transcoding
- Simple and fast
- Error drift
Transcoding Error Propagation
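The error drift of open-loop transcoding can be illustrated with a toy scalar model, in which each "frame" is a single predictively coded sample. The transcoder only requantizes the residual with a coarser step and never updates the prediction reference, so the end decoder's reconstruction drifts away from the original decoder's by the accumulated requantization error. The quantizer steps and all names here are hypothetical; this is a sketch of the effect, not a model of any particular codec.

```python
import numpy as np


def quantize(x, step):
    """Uniform quantizer with reconstruction levels at multiples of step."""
    return step * np.round(np.asarray(x, dtype=float) / step)


def simulate_open_loop(frames, q1=2.0, q2=6.0):
    """Return (drift, requantization error) per frame for a scalar toy codec.

    q1: quantizer step of the source encoder (assumed value)
    q2: coarser step applied by the open-loop transcoder (assumed value)
    """
    ref = 0.0             # reconstruction at the original bit rate
    ref_transcoded = 0.0  # reconstruction seen by the end decoder
    drift, requant_err = [], []
    for x in frames:
        residual = quantize(x - ref, q1)   # closed-loop source encoder
        ref = ref + residual
        coarse = quantize(residual, q2)    # transcoder: requantize only
        ref_transcoded = ref_transcoded + coarse
        drift.append(ref_transcoded - ref)
        requant_err.append(coarse - residual)
    return np.array(drift), np.array(requant_err)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = np.cumsum(rng.normal(size=30))
    drift, err = simulate_open_loop(frames)
    # The drift at frame t is the running sum of all earlier errors.
    print(np.allclose(drift, np.cumsum(err)))
```

In this model the drift is reset only when a frame is coded without prediction, which is why drift-free architectures keep a reconstruction loop inside the transcoder.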
9 Drift Free Video Transcoders
- Cascaded Pixel Domain Video Transcoding
- Low efficiency
- Long delay
- Fast Pixel Domain Video Transcoding
- Saves motion estimation, one frame memory and one IDCT operation
- Fast DCT-Domain Video Transcoding
- No IDCT-DCT operations; lower data volume
- DCT-domain inverse motion compensation is complex (research topic)
Fast Pixel Domain Video Transcoder
Fast DCT Domain Video Transcoder
10 Foveation Embedded DCT Domain Video Transcoding
11 Foveation Filtering
- Pixel Domain Foveation Filtering Technique [Lee, 99]
- High computational complexity
12 DCT-Domain Foveation Filtering
- DCT-Domain Block Mirror Filtering [Rao, 90]
- Pros
- Significantly simplified
- Combine with inverse quantization
- Easy to parallelize
- Cons
- Blocking artifacts
[Figure: block f, filter kernel h, and the DCT of f]
H. R. Sheikh, S. Liu, B. L. Evans and A. C. Bovik, "Real-Time Foveation Techniques for H.263 Video Encoding in Software," ICASSP 2001.
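The reason block mirror filtering is "significantly simplified" in the DCT domain can be sketched with the symmetric-convolution property of the DCT-II. Filtering a block that has been mirror-extended at its borders with a symmetric kernel is equivalent to scaling each DCT coefficient by a real gain. That gain could then be folded into the inverse quantization step. The code below checks this equivalence numerically for one 8 x 8 block and a separable kernel. It is an illustration under these assumptions, not necessarily the exact formulation of the cited work, and all function names are hypothetical.

```python
import numpy as np

N = 8  # block size


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that X = C @ x @ C.T for a block x."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


def dct_gain(half_kernel, n=N):
    """Gain H[k] of the symmetric kernel [..., h1, h0, h1, ...]."""
    h = np.asarray(half_kernel, dtype=float)
    k = np.arange(n)[:, None]
    j = np.arange(1, len(h))[None, :]
    return h[0] + 2.0 * np.sum(h[1:] * np.cos(np.pi * k * j / n), axis=1)


def filter_block_dct(coeffs, half_kernel):
    """Filter in the DCT domain: one multiplication per coefficient."""
    g = dct_gain(half_kernel, coeffs.shape[0])
    return coeffs * np.outer(g, g)


def filter_block_pixels(block, half_kernel):
    """Reference: mirror-extend the block, then convolve rows and columns."""
    h = np.asarray(half_kernel, dtype=float)
    kernel = np.concatenate([h[:0:-1], h])  # full symmetric kernel
    padded = np.pad(block, len(h) - 1, mode="symmetric")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    block = rng.uniform(0, 255, size=(N, N))
    C = dct_matrix()
    filtered = C.T @ filter_block_dct(C @ block @ C.T, [0.5, 0.25]) @ C
    print(np.allclose(filtered, filter_block_pixels(block, [0.5, 0.25])))
```

For foveation, a different kernel (and hence a different gain table) would be chosen per block according to its local cut-off frequency. Because each block is filtered independently, neighbouring blocks can be smoothed by different amounts, which is consistent with the blocking artifacts listed under Cons.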
13 Multipoint Video Conferencing
H. R. Sheikh, S. Liu, Z. Wang and A. C. Bovik, "Foveated Multipoint Videoconferencing at Low Bit Rates," ICASSP 2002, accepted.
14 Simulation Results
Foveated video at 256 kb/s
Uniform resolution video at 256 kb/s
Foveation point is at the center of the upper-left quadrant
15 Foveation Point Selection
- Interactive Methods
- Mouse, eye tracker
- Reverse channel is assumed
- End-to-end delay is assumed to be short enough
- Automatic Methods
- Fixation points analysis (Very challenging)
- Application oriented methods
- DCT-Domain Human Face Detection [Wang & Chang, 97]
- Skin color region segmentation
- Face template constraint
- Spatial Verification
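As a rough illustration of the first stage, skin color region segmentation can be done at block resolution directly from the DC coefficients of the chrominance blocks, without an inverse DCT. The sketch below thresholds the block-average Cb and Cr values. The chrominance ranges are assumed values of the kind found in the skin-segmentation literature, the function name is hypothetical, and it is assumed that no level shift was applied before the DCT. This is not a reproduction of the cited detector.

```python
import numpy as np

# Assumed chrominance ranges for skin tones (8-bit YCbCr); illustrative only.
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)


def skin_block_mask(cb_dc, cr_dc, block_size=8):
    """Mark each block as skin or non-skin from its chroma DC coefficients.

    cb_dc, cr_dc: 2-D arrays with one orthonormal-DCT DC coefficient per
    block. For an orthonormal N x N DCT the DC term is N times the block
    mean, so dividing by block_size recovers the average chroma value.
    """
    cb_mean = np.asarray(cb_dc, dtype=float) / block_size
    cr_mean = np.asarray(cr_dc, dtype=float) / block_size
    return ((cb_mean >= CB_RANGE[0]) & (cb_mean <= CB_RANGE[1]) &
            (cr_mean >= CR_RANGE[0]) & (cr_mean <= CR_RANGE[1]))


if __name__ == "__main__":
    cb = np.array([[8 * 100.0, 8 * 60.0]])
    cr = np.array([[8 * 150.0, 8 * 150.0]])
    print(skin_block_mask(cb, cr))
```

The resulting block mask would then be passed to the face template constraint and spatial verification stages to reject skin-colored regions that are not faces.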