Video Coding - PowerPoint PPT Presentation

Provided by: MsWi4
1
Video Coding
  • 18

2
Introductions
  • Inter-frame redundancy
  • HVS low-pass filter
  • Very insensitive to high spatial and temporal
    frequencies

3
Introductions
  • Conditional Replenishment
  • Motion-compensated Coding
  • 3-D Transform Coding
  • Hierarchical Coding
  • MPEG-X

4
Conditional Replenishment
5
Conditional Replenishment
  • 1 bpp with SNR = 31.02 dB
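As a rough illustration of conditional replenishment, the sketch below sends only the pixels that changed by more than a threshold since the previous frame, and the decoder keeps every other pixel as it was. The function name, the per-pixel granularity, and the threshold value are assumptions made for this example; practical coders typically decide per block or cluster.

```python
def conditional_replenishment(prev, curr, threshold=4):
    """Return (updates, reconstructed frame) for one frame pair.

    updates is a list of (row, col, value) for pixels whose change
    exceeds the threshold; everything else is repeated from prev.
    """
    updates = []
    recon = [row[:] for row in prev]          # decoder starts from the old frame
    for y, (prow, crow) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prow, crow)):
            if abs(c - p) > threshold:        # "changed" pixel: replenish it
                updates.append((y, x, c))
                recon[y][x] = c
    return updates, recon

prev = [[10, 10, 10], [10, 10, 10]]
curr = [[10, 11, 50], [10, 10, 90]]
updates, recon = conditional_replenishment(prev, curr)
print(len(updates), "of 6 pixels sent")
```

Small changes (such as 10 to 11 above) are treated as noise and not transmitted, which is where the bit savings come from.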

6
3-D Transformation
7
3-D Transformation
  • 0.5 bpp with SNR = 30.04 dB

8
Motion Compensated Coding
  • Block diagram

9
Motion Compensated Coding
  • Motion estimation

10
Motion Compensated Coding
  • Exhaustive search
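A minimal sketch of exhaustive (full) search block matching, assuming the sum of absolute differences (SAD) as the matching criterion and a small ±2 search window. The block size, window, frame contents, and function names are illustrative choices, not taken from the slides.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block and a displaced reference block."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def full_search(cur, ref, bx, by, n=4, search=2):
    """Test every displacement in the window; return (best SAD, dx, dy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= h and
                    0 <= bx + dx and bx + dx + n <= w):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Synthetic frames: the current frame is the reference displaced by (dx, dy) = (1, -1).
ref = [[x * x + 3 * y * y + x * y for x in range(8)] for y in range(8)]
cur = [[ref[min(max(y - 1, 0), 7)][min(max(x + 1, 0), 7)] for x in range(8)]
       for y in range(8)]
best = full_search(cur, ref, bx=2, by=2)
print(best)
```

Full search always finds the best match inside the window, at the price of evaluating (2·search + 1)² candidates per block.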

11
Motion Compensated Coding
  • Logarithmic search

12
Motion Compensated Coding
  • Three step search
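The three-step search can be sketched as a coarse-to-fine walk: test the centre and its 8 neighbours at step size 4, move to the best, then repeat at step sizes 2 and 1. In the sketch below the matching cost is passed in as a function, and the bowl-shaped `cost` used in the demo is a synthetic stand-in for a real block-matching error such as SAD; both are assumptions made for illustration.

```python
def three_step_search(cost, start_step=4):
    """Coarse-to-fine search over displacements; returns (dx, dy, best cost)."""
    cx, cy = 0, 0
    best = cost(0, 0)
    step = start_step
    while step >= 1:
        bx, by = cx, cy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                c = cost(cx + dx, cy + dy)
                if c < best:                  # keep the best of the 9 candidates
                    best, bx, by = c, cx + dx, cy + dy
        cx, cy = bx, by                       # re-centre on the winner
        step //= 2                            # 4 -> 2 -> 1
    return cx, cy, best

# Synthetic, single-minimum cost with its minimum at displacement (5, -3).
def cost(dx, dy):
    return (dx - 5) ** 2 + (dy + 3) ** 2

result = three_step_search(cost)
print(result)
```

With step sizes 4, 2, 1 the search covers a ±7 window using far fewer cost evaluations than exhaustive search, but it can settle on a local minimum when the error surface has several dips.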

13
Motion Compensated Coding
  • Hierarchical search - Zenith

14
Motion Compensated Coding
  • 0.125 bpp with SNR = 35.8 dB

15
MPEG-X
  • MPEG-1 (ISO/IEC 11172, Nov 92)
  • Compression standard for progressive frame-based
    video in SIF (360×240), targeted at 1.5 Mbits/s
  • 1.2 Mbits/s for video, 250 Kbits/s for audio
  • Applications: VCD, MP3
  • MPEG-2 (ISO/IEC 13818, Nov 94)
  • Compression standard for interlaced frame-based
    video in CCIR-601 (720×480) and high definition
    format (1920×1088), wide range of bit rates from
    4 to 80 Mbits/s
  • Optimized around 4 Mbits/s
  • Applications: DVD, HDTV, studio, etc.

16
MPEG-X
  • MPEG-4 (ISO/IEC 14496, Oct 98)
  • Multimedia standard for object-based video from
    natural or synthetic sources
  • Coding for various bandwidths (5 Kbps to 270 Mbps)
  • Applications: Internet, cable TV, 3G wireless
    communication, etc.
  • MPEG-7 (ongoing)
  • Multimedia content description interface
  • Applications: Internet, video search engine,
    digital library
  • Specifies only bitstream syntax and decoding

17
Why do companies work on standards?
  • Interoperability
  • War of formats
  • VHS vs. Beta, DIVX vs. DVD
  • Patent royalties
  • Licensing fee for an MPEG-2 box is US$4 from
    MPEG LA
  • Total licensing fee for DVD is around US$10
  • Big companies can avoid being taxed by other
    companies
  • $250 million per year for RCA patent profiles
  • Create new markets
  • VCD: Video Compact Disc
  • DVD: Digital Versatile Disc
  • DBS: Direct Broadcast Satellite
  • HDTV: Grand Alliance in the US, DVB in Europe

18
MPEG-1
  • Requirements (Part)
  • Coding of generic video with good quality (about
    VHS video) at 1 to 1.5 Mbits/s
  • Random access to a frame in limited time
  • Frequent access points
  • Fast forward/reverse
  • Seek and play in FF/FR using access points
  • System supporting audio-visual synchronized
    play/access
  • A practical/implementable decoder
  • Etc.

19
MPEG-1
  • New Features (w.r.t H.261)
  • Flexible picture sizes, picture rates, etc
  • Picture size up to 4096×4096 supported
  • Normally at 360×240
  • Picture rates: 23.976, 24 (movies), 25 (PAL),
    29.97, 30, 50, 59.94, 60
  • Bi-directional motion compensation
  • Half-pel motion compensation
  • VLC for MV difference
  • Etc.

20
MPEG-1
  • Motion compensation
  • Same as H.263
  • Half-pel resolution for motion vectors
  • Differential coding of motion vectors
  • Motion compensation on 16×16 luminance blocks
  • Motion vectors divided by 2 for chrominance
  • Different from H.263
  • VLC for MV difference
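The differential coding of motion vectors listed above can be sketched as follows: each vector is sent as its difference from the previous one, and those differences are what the VLC would then encode. Vectors are held as integers in half-pel units. The predictor reset to (0, 0) and the truncating division used for the chrominance vector are assumptions for this example, not the exact rules of the standard.

```python
def encode_mv_differences(mvs):
    """Replace each (x, y) vector by its difference from the previous vector."""
    pred = (0, 0)                       # assumed predictor at the start of a row
    diffs = []
    for mv in mvs:
        diffs.append((mv[0] - pred[0], mv[1] - pred[1]))
        pred = mv
    return diffs

def decode_mv_differences(diffs):
    """Inverse of encode_mv_differences: accumulate the differences."""
    pred = (0, 0)
    mvs = []
    for d in diffs:
        pred = (pred[0] + d[0], pred[1] + d[1])
        mvs.append(pred)
    return mvs

def chroma_mv(mv):
    """Halve a luminance vector for the subsampled chrominance planes (truncating toward zero)."""
    return (int(mv[0] / 2), int(mv[1] / 2))

mvs = [(4, -2), (5, -2), (5, 0)]        # half-pel units
diffs = encode_mv_differences(mvs)
print(diffs)
```

Neighbouring macroblocks tend to move together, so the differences cluster around zero and cost fewer bits under a variable-length code.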

21
MPEG-2
  • Requirements (Part)
  • ITU-R 601 interlaced video with high quality at
    4 to 9 Mbits/s
  • Scalable video coding for multi-quality video
    applications
  • Maximum interoperability/compatibility with
    MPEG-1
  • Support coding of non-interlaced and interlaced
    formats of many frame rates
  • Support video formats of various aspect ratios
  • Etc.

22
MPEG-2
  • New features
  • Allow 4:2:2 and 4:4:4 formats for chrominance
  • Frame-pictures and field-pictures
  • Frame/field/dual-prime adaptive motion
    compensation
  • New VLC table for DCT coefficients
  • Nonlinear quantization table
  • Slices always start and end in the same row of
    macroblocks
  • Motion vectors always coded in half-pel
  • Etc.

23
MPEG-2
  • Chrominance Sampling
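To make the chrominance sampling formats concrete, the sketch below computes the average number of bits per pixel for each format, assuming 8-bit samples. In 4:2:0 each chroma plane carries a quarter as many samples as luma, in 4:2:2 half as many, and in 4:4:4 the same number. The function name is a hypothetical helper.

```python
def bits_per_pixel(fmt, sample_bits=8):
    """Average bits per pixel for one luma plane plus two chroma planes."""
    chroma_fraction = {"4:2:0": 0.25, "4:2:2": 0.5, "4:4:4": 1.0}[fmt]
    return sample_bits * (1 + 2 * chroma_fraction)

for fmt in ("4:2:0", "4:2:2", "4:4:4"):
    print(fmt, bits_per_pixel(fmt))
```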

24
MPEG-2
25
MPEG-2
  • Coding of interlaced video
  • Frame-pictures or field pictures
  • Motion compensation
  • Frame prediction for frame-pictures: frame
    motion vectors
  • Same as MPEG-1
  • Field prediction for field-pictures: field
    motion vectors
  • Field prediction for frame-pictures
  • Prediction from either field of the previous
    frame
  • Good for fast motion
  • Dual-prime: two predictions are formed for each
    field from the two most recent fields. They are
    then averaged for the final prediction
  • Field-pictures or frame-pictures
  • Only for P-pictures
  • 16×8 MC for field-pictures: top and bottom halves
    of each macroblock

26
MPEG-1/2
  • Data structure: M = 3, N = 15 (tradeoff on M)
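A small sketch of what M = 3, N = 15 means, taking N as the distance between I-frames and M as the distance between anchor (I or P) frames. It prints the frame types in display order and a transmission order in which every B-frame follows both of its anchors. The function names are illustrative.

```python
def gop_display_types(n=15, m=3):
    """Frame types of one group of pictures in display order."""
    return "".join("I" if i == 0 else "P" if i % m == 0 else "B"
                   for i in range(n))

def coding_order(num_frames, n=15, m=3):
    """Display indices reordered so each B-frame comes after its future anchor."""
    order, pending_b = [], []
    for i in range(num_frames):
        if (i % n) % m == 0:            # I or P frame (anchor)
            order.append(i)
            order.extend(pending_b)     # the B-frames waiting for this anchor
            pending_b = []
        else:
            pending_b.append(i)
    order.extend(pending_b)             # trailing B-frames with no later anchor
    return order

print(gop_display_types())
print(coding_order(16))
```

A larger M means more B-frames and usually fewer bits, but more frame reordering and therefore more delay and buffering, which is the tradeoff noted on the slide.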

27
MPEG-1/2
  • I-frames
  • No temporal redundancy reduction
  • Has the highest bit count
  • For random access, FF, REW features
  • P-frames
  • Forward motion-compensated prediction
  • B-frames
  • Both forward and backward motion-compensated
    prediction
  • Usually results in the lowest bit count
  • Increase delay

28
MPEG-1/2
  • I-frame
  • JPEG DCT like
  • Highest data rate
  • Random access
  • FF/FR

29
MPEG-1/2
  • P-frame
  • Motion-compensated coding
  • Predicted from the previous I or P frame
  • Prediction error is then coded and transmitted

30
MPEG-1/2
  • B-frame
  • Delay
  • Buffer
  • Lowest data rate
  • Higher coding efficiency
  • No error propagation

31
MPEG-1/2
intraframe processing
Variable Length Encoder
Buffer Control Strategy
Predictive frame processing
interpolative frame processing
32
MPEG-1/2 Results
Bit rate (Mbits/s) | SIF-30 CVGA | CCIR 601 29.97 FPS VGA | HDTV 29.97 FPS | HDTV 60 FPS SVGA
Pels | 352 | 720 | 1920 | 1280
Lines | 240 | 480 | 1080 | 720
Original bit rates (Mbps) | 30.4 | 121.5 | 745.7 | 663.6
1.1 Mbps Good Poor
4.0 Mbps Excellent Good
9.0 Mbps Excellent Excellent
18.0 Mbps Excellent Good Good
28.0 Mbps Excellent Excellent
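The "original bit rates" row can be reproduced with simple arithmetic, assuming 8-bit 4:2:0 video (12 bits per pixel on average). The CCIR 601 figure of 121.5 Mbps matches 704 active pels per line rather than 720; that reading is an assumption made here, not something the table states.

```python
def raw_mbps(pels, lines, fps, bits_per_pixel=12):
    """Uncompressed bit rate in Mbits/s (12 bpp assumes 8-bit 4:2:0)."""
    return pels * lines * fps * bits_per_pixel / 1e6

formats = {
    "SIF-30": (352, 240, 30),
    "CCIR 601 (704 active pels assumed)": (704, 480, 29.97),
    "HDTV 29.97 fps": (1920, 1080, 29.97),
    "HDTV 60 fps": (1280, 720, 60),
}
for name, (pels, lines, fps) in formats.items():
    print(f"{name}: {raw_mbps(pels, lines, fps):.1f} Mbps")
```

Dividing a raw rate by a coded rate gives the compression ratio, for example about 27:1 for SIF coded at 1.1 Mbps.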
33
MPEG-1 Demo
  • Original: Y (720×480), UV (360×480)
  • Acer
  • original 20 60 96
  • Bike
  • original 33 49 66 81 97
  • Foot
  • original 33 50 66 108 109
  • Table Tennis
  • Original 40 49 98 147 150

34
MPEG-2 Demo
35
TV, Telecom & Computer Convergence
  • Past
  • Television: RCA, TCI, GI, etc.
  • Telecommunications: AT&T, Hughes, etc.
  • Computer: IBM, Microsoft, etc.
  • Now
  • Television
  • Hughes + Thomson (RCA) + Sarnoff -> DIRECTV
  • Microsoft + Philips -> WebTV
  • Telecommunications
  • AT&T + TCI + @Home -> Local Telephone Service +
    Internet
  • Computer
  • Windows CE + TV set-top box -> Venus

36
MPEG-4
  • What existing standards can do
  • MPEG-1: Frame-based non-interlaced video (1.5
    Mbps)
  • MPEG-2: Frame-based interlaced video (4 Mbps to
    270 Mbps)
  • H.261: Low bit rate video conference (p×64 Kbps)
  • H.263: Very low bit rate video conference (10
    Kbps)
  • What the existing standards cannot do
  • Coding of video objects with content information
    (metadata)
  • Coding of images for progressive transmission
  • Coding of multimedia information for various
    bandwidths and media (5 Kbps to 270 Mbps)
  • Interactive

37
What applications are relevant to us?
38
Internet Image Retrieval: JPEG vs. EZW
After 1 second
After 4 seconds
After 8 seconds
39
Multiresolution Feature Search Using Wavelet
40
Internet Commerce Using Metadata
41
Telemedicine Using Wavelet Compression
42
Consumer Videophone - Modes & Applications
Family / Home
Stand-alone wired videophone
PC-based videophone
POTS/ISDN
LAN
Good spatio-temporal quality
Low end-to-end delay
Channel error resilience
LAN ISDN
Virtual classroom Virtual meeting
43
Virtual Set Example
44
MPEG-4 Virtual Set Compositing
45
MPEG-4 Virtual Set
46
MPEG-4 Key Functionalities
Compression
Content-based interactivity
Universal access
47
MPEG-4 Key Functionalities
  • Content-based interactivity
  • A scene is composed of audio-visual objects
  • Not just pixels or moving blocks
  • Objects can be of different nature
  • Text or images
  • Rectangular or arbitrary shape
  • 2D or 3D objects
  • Natural or synthetic
  • Different coding schemes applied to different
    objects
  • Composer puts objects back in a scene
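The idea that a composer puts separately decoded objects back into a scene can be sketched with binary alpha masks: each object carries a texture, a shape mask, and a position, and only its opaque pixels overwrite the scene. The dictionary layout and function name are assumptions for this example, not MPEG-4 syntax.

```python
def compose(background, objects):
    """Paint objects onto a copy of the background; later objects end up on top."""
    scene = [row[:] for row in background]
    for obj in objects:
        ox, oy = obj["pos"]                       # top-left corner in the scene
        for j, (trow, arow) in enumerate(zip(obj["texture"], obj["alpha"])):
            for i, (t, a) in enumerate(zip(trow, arow)):
                y, x = oy + j, ox + i
                inside = 0 <= y < len(scene) and 0 <= x < len(scene[0])
                if a and inside:                  # alpha 1 = opaque object pixel
                    scene[y][x] = t
    return scene

background = [[0] * 4 for _ in range(3)]
obj = {"pos": (1, 1),
       "texture": [[7, 7], [7, 7]],
       "alpha":   [[1, 0], [1, 1]]}               # arbitrary (non-rectangular) shape
scene = compose(background, [obj])
for row in scene:
    print(row)
```

Changing `pos`, dropping an object from the list, or adding a new one changes the composed scene without re-encoding any other object.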

48
MPEG-4 Key Functionalities
  • Universal access
  • Robustness in error-prone environments
  • Allow applications over wired and wireless
    networks
  • Robust for severe error conditions, e.g. long
    error bursts
  • Content-based scalability
  • Allowing scalability in content, quality and
    complexity
  • Achieving content based scaling of visual
    information

49
MPEG-4 Key Functionalities
  • Compression
  • Improved coding efficiency
  • MPEG-4 video shall provide subjectively better
    visual quality at comparable bit rates compared
    to existing or emerging standards
  • 5-64 Kbps for mobile applications
  • Up to 20 Mbps for TV/film applications
  • Coding of multiple concurrent data streams
  • Can code multiple views of a scene efficiently,
    e.g. stereo video

50
MPEG-4 AV Objects
  • Audiovisual scene is composed of objects (A/V)
  • Compositor puts objects in scene (A/V, 2D/3D)
  • Objects can be of different nature
  • natural or synthetic A/V, text & graphics,
    animated faces, arbitrary shapes or rectangular
  • Encoding the object independently
  • Coding scheme can differ for individual objects
  • From low bitrates to (virtually) lossless quality

51
Object Functionalities in MPEG-4
52
Video object planes (VOPs)
  • Image = different objects + text + background
    (VOPs)
  • Single VOP backward compatible to MPEG-1/2
  • Composition or segmentation
  • Note: segmentation is outside the scope of
    MPEG-4
  • Characteristics of VOP
  • Separate object coding
  • Separate object manipulation
  • May have different spatial and temporal
    resolutions
  • May be associated with different degrees of
    accessibility (sub-VOPs)
  • May be separated or overlapping

53
Separated and overlapping VOP
54
Content-based object manipulation
  • Change of the spatial position of a VOP
  • Application of a spatial scaling factor to a VOP
  • Change of the speed with which a VOP moves
  • Insertion of new VOPs
  • Deletion of a video object (VO) in the scene
  • Successive VOPs belonging to the same physical
    object in a scene are referred to as a video
    object (VO)
  • Change of the scene area

55
Example of bit stream manipulation (1)
56
Example of bit stream manipulation (2)
57
Segmentation process
  • Depending on applications, segmentation can be
    performed
  • Online (real-time) or offline (non-real time)
  • Automatic or semi-automatic
  • Examples
  • Video conferencing
  • Real-time, automatic
  • Separate foreground (communication partner) from
    background
  • Object tracking in video
  • May allow off-line and semi-automatic
  • Separate moving object from others

58
Video object plane formation
  • Rectangular or arbitrary

59
Demo
  • Segmentation
  • Bit stream manipulation

60
Video object-based coding
  • Each Video Object in a Scene is Coded and
    Transmitted Separately

61
Data structure in visual part of MPEG-4
Visual object Sequence (VS)
Video Object (VO)
Video Object Layer (VOL) Still Object Layer (SOL)
SOL0
Group of Video Object Plane (GOV)
Video Object Plane (VOP)
62
MPEG-4 Video Decoder
63
MPEG-4 Video Decoder
64
MPEG-4 Video Decoder
  • Scene description is necessary.
  • A language called the Binary Format for Scenes
    (BIFS) based on the Virtual Reality Modeling
    Language (VRML) has been developed by MPEG for
    scene description.
  • The decoder can use the scene description and
    additional input from the user to combine or
    compose the objects to reconstruct the original
    scene or create a variation on it.

65
MPEG-4 standard video tools
66
MPEG-4 standard video tools
  • The glue that will bind these tools together is
    the MPEG-4 systems description language (MSDL)
    which will have several components, including
  • Definitions for the interfaces between the coding
    tools,
  • A mechanism to combine coding tools,
  • A mechanism to download new tools.
  • The MSDL will transmit to the decoder the bit
    stream and the manner in which the tools have to
    be used at the decoder to reconstruct the audio
    and video.

67
Shape coding tool
  • Every VOP is coded macroblock by macroblock
  • The bounding rectangle of the VOP is extended on
    the right-bottom side to multiples of 16x16
    blocks (macroblock).
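The extension of the bounding rectangle to whole macroblocks is a round-up to the next multiple of 16 in each dimension. A small sketch, with a hypothetical helper name:

```python
def extend_to_macroblocks(width, height, mb=16):
    """Round a bounding rectangle up to whole macroblocks (growing right and bottom)."""
    ext_w = -(-width // mb) * mb        # ceiling division, then back to pixels
    ext_h = -(-height // mb) * mb
    macroblocks = (ext_w // mb) * (ext_h // mb)
    return ext_w, ext_h, macroblocks

print(extend_to_macroblocks(37, 50))
```

Because the top-left corner stays fixed, only the right and bottom edges grow, as the slide describes.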

68
Description of VOP
69
Shape Coding
  • Binary alpha planes (shape information) are
    encoded by context-based arithmetic encoding
    (CAE).
  • Gray scale alpha planes are encoded by motion
    compensated DCT similar to texture coding.
  • An alpha plane is bounded by a rectangle that
    includes the shape of a VOP.
  • Intra (I-VOPs and P-VOPs) or inter (P-VOPs and
    B-VOPs) shape coding at macroblock level
  • Inter motion compensated shape

70
Shape Coding (cont.)
  • Motion vectors from texture motions or shape
    motion of neighboring blocks
  • Coding modes
  • Opaque
  • Transparent
  • No-update
  • Intra Context based Arithmetic Encoding
  • Inter Context based Arithmetic Encoding
  • Lossless
  • Lossy
  • Motion compensation without update
  • Sub-sampling by factor 2 or 4

71
Shape Coding Tools - CAE
  • Context based Arithmetic Encoding (CAE) of the
    pixel ?
  • Intra
  • Inter

72
Shape Coding Tools - CAE
  1. Compute a context number.
  2. Index a probability table using the context
    number
  3. Use the indexed probability to drive an
    arithmetic encoder.
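The three steps above can be sketched as follows. The context number is built from 10 already-coded neighbouring pixels of the binary alpha plane. The template layout here is an assumption modelled on a 10-pixel intra template, and the probability table is a simple adaptive count rather than the fixed table a standard decoder would use. Instead of running a real arithmetic coder, the sketch adds up the ideal code length, -log2(p), that such a coder would approach.

```python
import math

# (dx, dy) offsets of the context pixels c0..c9, all already coded (assumed layout).
TEMPLATE = [(-1, 0), (-2, 0),
            (2, -1), (1, -1), (0, -1), (-1, -1), (-2, -1),
            (1, -2), (0, -2), (-1, -2)]

def context_number(alpha, x, y):
    """Step 1: pack the neighbouring alpha bits into a 10-bit context number."""
    ctx = 0
    for k, (dx, dy) in enumerate(TEMPLATE):
        xx, yy = x + dx, y + dy
        inside = 0 <= yy < len(alpha) and 0 <= xx < len(alpha[0])
        bit = alpha[yy][xx] if inside else 0      # outside the plane counts as transparent
        ctx |= bit << k
    return ctx

def estimate_bits(alpha):
    """Steps 2 and 3: look up p(bit | context) and accumulate the ideal code length."""
    counts = {}                                   # context -> [count of 0s, count of 1s]
    bits = 0.0
    for y, row in enumerate(alpha):
        for x, bit in enumerate(row):
            c = counts.setdefault(context_number(alpha, x, y), [1, 1])
            bits += -math.log2(c[bit] / (c[0] + c[1]))
            c[bit] += 1                           # adapt the table (assumption)
    return bits

flat = [[0] * 4 for _ in range(4)]
print(round(estimate_bits(flat), 3), "bits for 16 transparent pixels")
```

A uniform region costs only a few bits in total because the same context keeps predicting the same value with growing confidence.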

73
Motion Compensated DCT
  • A hybrid coding scheme used in H.261, H.263,
    MPEG-1 and MPEG-2
  • Reduces the temporal/spatial correlation of video
    objects in two steps
  • Temporal by motion compensation
  • Spatial by Discrete Cosine Transform (DCT)
    transform coding.
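The two steps of the hybrid scheme can be sketched for a single 8×8 block: subtract the motion-compensated prediction (temporal step), then apply a 2-D DCT to the residual (spatial step). The direct, unoptimized DCT, the integer-pel motion vector, and the synthetic frames are assumptions made to keep the example short.

```python
import math

def dct2_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (direct form, no fast algorithm)."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):                  # vertical frequency
        for v in range(8):              # horizontal frequency
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / 16)
                              * math.cos((2 * x + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs

def mc_dct(cur, ref, bx, by, mv):
    """Residual after motion compensation, followed by the DCT."""
    dx, dy = mv
    residual = [[cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i]
                 for i in range(8)] for j in range(8)]
    return dct2_8x8(residual)

# The current frame is the reference shifted by (2, 1) and brightened by 3.
ref = [[(3 * x + 5 * y) % 50 for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] + 3 for x in range(12)]
       for y in range(12)]
coeffs = mc_dct(cur, ref, 1, 1, (2, 1))
print(round(coeffs[0][0], 3))
```

With the right motion vector the residual is a flat block, so all of its energy lands in the DC coefficient and the 63 AC coefficients are essentially zero.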

74
Block Based Motion Compensation
  • Models translational motion of a block between
    frames with a motion vector.
  • Motion compensation is performed block by block.

75
Motion Compensation Tools
Motion compensated coding modes (I, B, P)
76
Motion Compensation Tools - Motion Computation
77
Motion Compensation Tools - Padding
Process of normal padding of a block
Process of padding of a VOP
Process of extended padding of a block
78
Padding
79
Motion estimation & compensation
80
Block-based compatibility for VOP
81
Texture Coding Tools (1/3)
  • The intra VOPs, as well as residual errors after
    motion compensated prediction, are coded using
    DCT on 8×8 blocks, in a manner similar to that
    employed in MPEG-1, MPEG-2, H.261, and H.263.
  • Backward compatible to MPEG-1 and MPEG-2
  • Efficient prediction of DC and AC coefficients
    for intra and inter-coded blocks can also be
    employed (this approach is not available in
    MPEG-1 and MPEG-2).

82
Texture Coding Tools (2/3)
83
Texture Coding Tools (3/3)
84
Adaptive DC prediction (texture coding)
Block (8x8)
85
Adaptive AC prediction (texture coding)
86
Coefficients Scanning (texture coding)

Alternate-Horizontal scan

Alternate-Vertical scan

zig-zag scan (H.263/MPEG-1)
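The conventional zig-zag scan order can be generated rather than tabulated: coefficients are visited one anti-diagonal at a time, alternating direction on each diagonal. The sketch below covers only the zig-zag scan; the alternate-horizontal and alternate-vertical scans use their own tables and are not reproduced here.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        s = r + c                          # index of the anti-diagonal
        return (s, r if s % 2 else -r)     # odd diagonals run downward, even ones upward
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

order = zigzag_order()
indices = [r * 8 + c for r, c in order]    # raster index of each scanned coefficient
print(indices[:10])
```

Scanning this way puts the low-frequency coefficients first and tends to group the trailing zeros into one long run.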
87
Quantization (texture coding)
  • Method 1 Similar to that of H.263
  • Method 2 Similar to that of MPEG-2
  • Optimized non-linear quantization of DC
    coefficients
  • Quantization matrices and loading mechanism

88
Scalability
  • Object scalability
  • Achieved by the data structure used and the shape
    coding
  • Temporal scalability
  • Achieved by generalized scalability mechanism
  • Spatial scalability
  • Achieved by generalized scalable mechanism

89
Object scalability
90
Temporal scalability
91
Spatial scalability
92
Static Sprite Coding Tools (1/5)
  • A sprite is an image composed of pixels belonging
    to a video object visible throughout a video
    segment.
  • For instance, sprite generated from a panning
    sequence will contain all the visible pixels of
    the background object throughout the sequence.
  • Portions of this background may not be visible in
    certain frames due to the occlusion of the
    foreground objects or the camera motion.
  • Thus, the sprite contains all parts of the
    background that were at least visible once.

93
Static Sprite Coding Tools (2/5)
  • The sprite encoding syntax can be utilized for
    the transmission of any still image to the
    decoder since a sprite is essentially just a
    still image.
  • Static sprites are those that are directly copied
    (including appropriate warping and cropping) to
    generate a particular rendition of the sprite at
    a particular time instant.
  • Sprite: the panoramic view of the background.
  • Improves the coding efficiency for video
    sequences with frequently revisited backgrounds.

94
Static Sprite Coding Tools (3/5)
The main idea of static sprite coding technique
is to generate the reconstructed VOPs by directly
warping the quantized sprite using specified
motion parameters. Residual error between the
original VOP and the warped sprite is not added
to the warped sprite.
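A minimal sketch of this idea, assuming the simplest possible motion model: a pure translation, so that "warping" reduces to cropping a window out of the sprite. Real sprite coding allows more general warps. The function name and the tiny sprite are illustrative.

```python
def reconstruct_from_sprite(sprite, width, height, dx, dy):
    """Cut the visible window out of the sprite; no residual is added."""
    return [row[dx:dx + width] for row in sprite[dy:dy + height]]

# A 3x6 "panorama" whose pixel values encode their own position (row*10 + col).
sprite = [[r * 10 + c for c in range(6)] for r in range(3)]

# A camera pan is then just a sequence of motion parameters, one per frame.
frames = [reconstruct_from_sprite(sprite, 3, 2, dx, 1) for dx in (0, 1, 2)]
print(frames[2])
```

Once the sprite has been sent, each frame of the pan costs only its motion parameters, which is why revisited backgrounds code so cheaply.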
95
Static Sprite Coding Tools (4/5)
  • Basic sprite (a large static image) coding
  • Low latency sprite coding (sent hierarchically)
  • Scalable sprite coding

96
Static Sprite Coding Tools (5/5)
Sprite
Foreground Object

Decoded Frame
97
Wavelet Tool (1/3)
  • Discrete Wavelet Transform (DWT)
  • Still image coding mode
  • Separate DC band Coding
  • Zero-Tree Scanning (ZTS) and Multiscale Zero-Tree
    Entropy (MZTE) coding
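To show what one level of a 2-D discrete wavelet transform produces, the sketch below uses the Haar filter (pairwise averages and differences). Haar is chosen only because it is the shortest wavelet to write down; it is not claimed to be the filter used by the MPEG-4 wavelet tool. The image dimensions are assumed to be even.

```python
def haar_1d(v):
    """Pairwise averages (low band) and differences (high band) of one line."""
    low = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v), 2)]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar_2d(img):
    """One decomposition level: returns the LL, LH, HL, HH subbands."""
    lows, highs = zip(*(haar_1d(row) for row in img))     # filter along rows

    def vertical(m):                                      # filter along columns
        l, h = zip(*(haar_1d(col) for col in transpose(m)))
        return transpose(l), transpose(h)

    ll, lh = vertical(lows)     # LH: low horizontally, high vertically
    hl, hh = vertical(highs)    # HL: high horizontally, low vertically
    return ll, lh, hl, hh

ll, lh, hl, hh = haar_2d([[1, 3], [5, 7]])
print(ll, lh, hl, hh)
```

The LL band is a half-resolution version of the image and is the band coded separately as the DC band; the other three bands hold the detail that zerotree coding exploits.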

98
Wavelet Tool: ZTS (2/3)
  • A general architecture for zerotree coding.
  • Provides tradeoffs between scalability,
    complexity and efficiency.

Zero-Tree Scanning (ZTS)
99
Wavelet Tool: Multiscale ZTE (3/3)
100
Arbitrary Shaped Wavelet Tool
  • Shape-adaptive wavelet: a more general case of
    rectangular wavelet.
  • Zerotree Coding encodes only the interior nodes
  • Downsampling the original shape to obtain the
    shapes in different resolutions.
  • Using a shape coding scheme to include the shape
    information.

101
Shape adaptive wavelet coding - SNR Scalability
bitstream
30kbits
8kbits
5kbits
102
Shape adaptive wavelet coding - Spatial
Scalability
103
12-Bit Video Coding Tool
  • Allows compression of video data with precision
    of up to 12-bits/pixel
  • The syntax, semantics, and coding tools are
    extended
  • bit-precision
  • extended DC VLC tables
  • extended quantization mechanism
  • Insertion of marker bits to avoid start code
    emulations

104
MPEG-4 demo
  • Coastguard
  • Original 128 Kbits/sec
  • Foreman
  • Original 128 Kbits/sec
  • Hall_monitor
  • Original 128 Kbits/sec

105
Summary (1/2)
  • MPEG-4
  • The first content-based standard, addressing
    multimedia.
  • Object-based representation of a scene.
  • Both natural and synthetic.
  • Compression + many other features.
  • Normative decoder (i.e., bitstream syntax and
    decoding algorithm).

106
Summary (2/2)
  • MPEG-4 Visual
  • Tool box approach, i.e. consists of many tools.
  • One tool, one functionality.
  • Set of tools -> Object.
  • Set of objects -> Combination Profile.
  • Conformance points on combination profiles.

107
SNHC Tools
  • SNHC: Synthetic/Natural Hybrid Coding
  • An MPEG-4 subgroup working on the synthetic
    tools.
  • Tools for Version 1
  • Face Animation
  • Dynamic 2D Meshes
  • Scalable Textures

108
SNHC Tool: Face Animation
  • Face: an object capable of facial geometry ready
    for rendering and animation
  • A synthetic representation of a human face
  • visual manifestations of speech are intelligible
  • facial expressions allow recognition of moods
  • specified by the parameters in the incoming
    bitstream

109
SNHC Tool: Face Animation
Defines a specific face via
  • 3D feature points
  • 3D mesh/scene graph
  • Face texture
  • Face Animation Table
110
SNHC Tool: Dynamic Meshes
  • Specifically refers to triangular meshes
  • Tessellation of a 2D visual object plane into a
    connection of triangular patches
  • No addition and deletion of nodes, i.e. no change
    in topology.
  • Used for video object
  • manipulation

111
(No Transcript)
112
SNHC Tool: Dynamic Meshes
113
SNHC Tool: Dynamic Meshes
114
Conclusions
  • Refer to exercise for further information
  • Other search strategies
  • Motion JPEG (MJPEG)
  • DV
  • Already used for high quality video coding
  • Motion JPEG2000 (MJPEG2000)
  • JPEG 2000 for video
  • Collaborative or Competitive ?
  • Compression ratio may be higher than MPEG-1/2
  • Symmetric algorithm, however
  • Part III of JPEG2000 standard
  • original 40 53 67 80 100

115
Conclusions
  • Standard is usually not the best
  • Demo
  • Avxing
  • Windows Media Video

116
Acknowledgement
  • I would like to thank Prof. Tihao Chiang
    of National Chiao-Tung University and his
    ex-colleagues Dr. Ya-Qin Zhang, Iraj Sodagar, and
    Sriram Sethuraman. Professor Chiang generously
    gave me his transparency master for MPEG-4
    tutorial and this helped me very much in
    preparing this lecture.