Title: Video Processing
1Video Processing in Compressed Domain
By Prof. Jayanta Mukhopadhyay
2Video Resizing
3MPEG Introduction
Encoding: DCT, quantization, motion estimation and compensation, VLC.
Decoding: IDCT, inverse quantization, inverse motion compensation, VLC decoding.
Frame sequence: INTRA frames with motion compensated inter frames between them.
4Compressed Domain (DCT) Processing
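Spatial domain: MPEG video -> VLC Decoder -> Inverse Quantization -> IDCT -> Processing Box -> DCT -> Quantization -> VLC Encoder -> MPEG video
Compressed domain: MPEG video -> VLC Decoder -> Inverse Quantization -> 8 x 8 DCT blocks -> Processing Box -> 8 x 8 DCT blocks -> Quantization -> VLC Encoder -> MPEG video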
5Video Downscaling
Applications: browsing remote video databases, picture-in-picture (PIP), video conferencing, transcoding, etc.
Approaches:
- Spatial Domain Technique
- Hybrid (Spatial + DCT) Technique
- Pure DCT Domain Technique
6Video Resizing
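Spatial Domain Technique (block diagram)
Decoder: Input Data -> Buffer -> VLC Decoder -> Q^-1 -> IDCT -> Motion Compensation (with Frame Memory) -> Spatial Downscaling
Encoder: Motion Estimation & Compensation (with Frame Memory) -> subtraction -> DCT -> Q -> VLC Encoder -> Buffer -> Output, with Q^-1 and IDCT feeding the encoder's Frame Memory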
7Video Resizing
Computational complexity for P frames, from CIF resolution to QCIF resolution

Function                      Complexity                             Mults.   Adds       Shifts
Inverse Quant. + IDCT         144m, 464a per 8x8 block               228096   734976     -
Inverse Motion Compensation   256a per 16x16 block                   -        101376     -
Downscale by 2                3a, 1s per pixel                       -        76032      25344
Full Search ME                +/-15 pels, 738048a per 16x16 block    -        73066752   -
Motion Compensation           256a per 16x16 block                   -        25344      -
DCT + Quant.                  144m, 464a per 8x8 block               57024    183744     -
Total                                                                285120   74188224   25344

Total operations count (Add = 1 op, Shift = 1 op, Mult. = 3 ops): 75068928
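The weighted total can be reproduced from the per-stage counts. The sketch below is only a bookkeeping aid; the dictionary layout and the names `WEIGHTS`, `SPATIAL_STAGES` and `total_ops` are illustrative choices, not part of the original slides.

```python
# Operation weights stated on the slide: add = 1 op, shift = 1 op, mult = 3 ops.
WEIGHTS = {"mults": 3, "adds": 1, "shifts": 1}

# Per-stage counts for converting one P frame from CIF to QCIF (spatial method).
SPATIAL_STAGES = {
    "inverse_quant_idct": {"mults": 228096, "adds": 734976},
    "inverse_motion_compensation": {"adds": 101376},
    "downscale_by_2": {"adds": 76032, "shifts": 25344},
    "full_search_me": {"adds": 73066752},
    "motion_compensation": {"adds": 25344},
    "dct_quant": {"mults": 57024, "adds": 183744},
}

def total_ops(stages, weights=WEIGHTS):
    """Sum every stage's counts, weighting each kind of operation."""
    return sum(weights[kind] * count
               for stage in stages.values()
               for kind, count in stage.items())

spatial_total = total_ops(SPATIAL_STAGES)
print(spatial_total)
```

Almost all of the cost comes from the full-search motion estimation row, which is why the later slides replace it with motion vector re-estimation.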
8Video Resizing
DCT based Downscale (block diagram)
DCT Intra Frame: Intra DCT blocks -> DCT based Downscale -> Downscaled DCT Intra Frames
DCT Inter Frame (with Motion Vectors): DCT based Inverse Motion Compensation -> Intra DCT blocks -> DCT based Downscale -> DCT based Motion Estimation & Compensation -> Downscaled DCT Intra & Inter Frames
9Video Resizing
DCT Domain based down-sampling of Intra Frames by a factor of two
Compressed Bitstream -> Huffman Decoder & Dequantizer -> 8x8 DCT Blocks -> down-sampling -> 8x8 DCT Block -> Quantizer & Huffman Encoder -> Compressed Bitstream (downscaled)
10Video Resizing
Downscaling Technique for an Intra frame (DCT Domain)
Computational complexity:
- Downscaling: 1.25m + 1.25a per pixel of the original frame
- Upsampling: 1.25m + 1.25a per pixel of the upsampled frame
11Video Resizing
DCT Domain based Inverse Motion Compensation (IMC)
Huffman Decoder and Dequantizer -> 8x8 DCT Error Blocks -> DCT Domain Inverse Motion Compensation (using the previous frame's DCT domain data, i.e. 8x8 DCT Intra Blocks) -> 8x8 DCT Intra Blocks -> Quantizer and Huffman Encoder
12Video Resizing
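DCT Domain based Inverse Motion Compensation (Neri Merhav's Scheme)
The reference block overlaps four adjacent 8x8 blocks x1, x2, x3, x4; h and w are the height and width of its overlap with x1. The prediction error (inter) block E is added to the prediction to give the (intra) block. The prediction is
$$x = \sum_{i=1}^{4} c_{i1}\, x_i\, c_{i2} \qquad (1)$$
where c_{ij}, i = 1, ..., 4, j = 1, 2 are sparse 8x8 matrices of zeros and ones.
Expression (1) can be written as
$$X = \sum_{i=1}^{4} S\,c_{i1}\,S^t\; X_i\; S\,c_{i2}\,S^t \qquad (2)$$
since S^t S = I, where S is the 8-point DCT matrix and X, X_i are the DCTs of x, x_i. S can be factorized as S = D P B1 B2 M A1 A2 A3.
Expression (2) can further be written as
$$X = S\,\big[\,J_h B_2^t B_1^t P^t D\,( X_1 D P B_1 B_2 J_w^t + X_2 D P B_1 B_2 K_{8-w}^t ) + K_{8-h} B_2^t B_1^t P^t D\,( X_3 D P B_1 B_2 J_w^t + X_4 D P B_1 B_2 K_{8-w}^t )\,\big]\,S^t$$
where J_i = U_i (M A1 A2 A3)^t and K_i = L_i (M A1 A2 A3)^t, i = 1, 2, ..., 8.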
13Video Resizing
Computational complexity of Neri Merhav's IMC

Matrix    Computations per column
J1        3m + 6a
J2        4m + 10a
J3        5m + 16a
J4        5m + 19a
J5        5m + 20a
J6        5m + 22a
J7        5m + 24a
J8        5m + 28a
B1/B1t    4a
B2/B2t    4a
S/St      5m + 29a

Let w = h = 4. Total computations:
- Six multiplications by B1 or B1t: 6 x 8 x 4 = 192a
- Six multiplications by B2 or B2t: 6 x 8 x 4 = 192a
- Two multiplications each by Jw and K8-w, and one each by Jh and K8-h: 8 x (3 x (5m + 19a + 5m + 19a)) = 240m + 912a
- One 2D DCT operation: 2 x (8 x (5m + 29a)) = 80m + 464a
Total operations: 320m + 1760a (per 8x8 block)
Operations per pixel: 5m + 27.5a
14Video Resizing
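Modified IMC technique (MBIMC)
The 16x16 reference macroblock M overlaps a 3x3 arrangement of 8x8 blocks x1, ..., x9; r and c give its row and column position, with 1 <= r <= 8 and 1 <= c <= 8. The prediction error (inter) macroblock E is added to M to give the (intra) macroblock.
$$M = C_r \begin{bmatrix} x_1 & x_2 & x_3 \\ x_4 & x_5 & x_6 \\ x_7 & x_8 & x_9 \end{bmatrix} C_c$$
where C_r and C_c are row and column selector matrices of size 16x24 and 24x16, respectively:
$$C_r = \begin{bmatrix} 0_{16 \times (r-1)} & I_{16} & 0_{16 \times (8-r+1)} \end{bmatrix}$$
that is, C_r has 16 rows made up of r-1 zero columns, a 16-column identity, and 8-r+1 zero columns.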
15Video Resizing
Macroblock wise IMC in DCT domain
(A)
Using the 8-point DCT matrix factorization, we
can represent -
where S is the 8-point DCT matrix. S can be factorized as S = D P B1 B2 M A1 A2 A3.
16Video Resizing
The expression (A) can be written as
Let us represent J_r = C_r Q^t and K_c = Q C_c, with 1 <= r <= 8 and 1 <= c <= 8.
- J_r and K_c have similar complexities because of their similar structure.
- Multiplication by the J_r and K_c matrices can also be implemented efficiently by extending the approach of Neri Merhav.
17Video Resizing
Computational complexity of the Modified IMC scheme

Matrix    Computations per column
J1        10m + 56a
J2        13m + 58a
J3        14m + 60a
J4        15m + 64a
J5        15m + 66a
J6        15m + 64a
J7        14m + 60a
J8        13m + 58a
B1/B1t    12a
B2/B2t    12a
S/St      5m + 29a

Let r = c = 5. Total computations:
- Two multiplications of B1 type: 2 x 24 x 12 = 576a
- Two multiplications of B2 type: 2 x 24 x 12 = 576a
- One multiplication each by Jr and Kc: 2 x 24 x (15m + 66a) = 720m + 3168a
- Four 2D DCT operations: 4 x (8 x (5m + 29a)) = 160m + 928a
Total computations: 880m + 5248a (per 16x16 block)
Operations per pixel: 3.43m + 20.5a
27% improvement on Neri Merhav's approach
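The improvement figure can be checked with a few lines of arithmetic. This sketch assumes the comparison uses the same weighting as the earlier operation counts (one multiplication = 3 ops, one addition = 1 op); the slides do not state how the percentage was derived, and the function name is illustrative.

```python
def weighted_ops_per_pixel(mults, adds, pixels, mult_weight=3):
    """Weighted operation count per pixel (assumed weighting: mult = 3, add = 1)."""
    return (mult_weight * mults + adds) / pixels

# Neri Merhav's IMC: 320m + 1760a per 8x8 block.
merhav = weighted_ops_per_pixel(320, 1760, 8 * 8)
# MBIMC: 880m + 5248a per 16x16 macroblock.
mbimc = weighted_ops_per_pixel(880, 5248, 16 * 16)

improvement_percent = 100 * (merhav - mbimc) / merhav
print(merhav, mbimc, improvement_percent)
```

Under that assumption the two schemes cost 42.5 and about 30.8 weighted operations per pixel, a reduction of about 27.5%, which is consistent with the quoted figure.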
18Video Resizing
PSNR difference between Spatial and MBIMC
technique
Plots shown for the videos Susi and Flower.
19Video Resizing
Integrated Scheme (IMC + Downscaling)
Downsampling filter: if x1, x2, x3, x4 are adjacent 8x8 spatial domain blocks, the downsampled block x can be computed as
$$x = d \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix} d^t \qquad (B)$$
where d is the 8x16 downsampling filter
d = 0.5 x
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
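Applying d on the left and d^t on the right averages each 2x2 neighbourhood of the 16x16 area. The following pure-Python sketch illustrates this in the spatial domain only; the helper names are illustrative, and the 16x16 input stands for the four adjacent blocks arranged as [x1 x2; x3 x4], which is an assumed layout.

```python
def downsampling_filter(n_out=8):
    """Build the n_out x 2*n_out matrix d with 0.5 in columns 2k and 2k+1 of row k."""
    d = [[0.0] * (2 * n_out) for _ in range(n_out)]
    for k in range(n_out):
        d[k][2 * k] = 0.5
        d[k][2 * k + 1] = 0.5
    return d

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def downsample(block16):
    """Compute x = d * block16 * d^t for a 16x16 spatial block."""
    d = downsampling_filter(8)
    return matmul(matmul(d, block16), transpose(d))

# Test pattern: pixel value = 16 * row + column.
block16 = [[16 * r + c for c in range(16)] for r in range(16)]
small = downsample(block16)
print(small[0][0], small[7][7])
```

Each output pixel is the mean of four input pixels, for example (0 + 1 + 16 + 17) / 4 = 8.5 in the top-left corner.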
20Video Resizing
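Using expressions (A) and (B), the downsampled 8x8 block x, obtained from the 16x16 macroblock M, can be written in terms of the DCT blocks X1, ..., X9:
$$x = d\,M\,d^t = d\,C_r\,Q^t B_2^t B_1^t P^t D^t \begin{bmatrix} X_1 & X_2 & X_3 \\ X_4 & X_5 & X_6 \\ X_7 & X_8 & X_9 \end{bmatrix} D P B_1 B_2 Q\,C_c\,d^t$$
The DCT of x is then taken.
Let us represent J_r = d C_r Q^t and K_c = Q C_c d^t, with 1 <= r <= 8 and 1 <= c <= 8.
J_r and K_c have similar structure and similar computational complexity, but the structure takes two different forms: one when r (or c) is even and one when r (or c) is odd.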
21Video Resizing
Computational complexity of the Integrated Scheme (IMC + Downsampling)

Matrix    Computations per column
J1        10m + 34a
J2        17m + 44a
J3        14m + 38a
J4        16m + 43a
J5        15m + 41a
J6        17m + 44a
J7        14m + 38a
J8        15m + 44a

Let r = c = 6.
- Two multiplications of B1/B1t: 2 x 12 x 24 = 576a
- Two multiplications of B2/B2t: 2 x 12 x 24 = 576a
- One multiplication by J6: 24 x (17m + 44a) = 408m + 1056a
- One multiplication by K6: 24 x (17m + 44a) = 408m + 1056a
- One 2D-DCT operation: 8 x (5m + 29a) = 40m + 232a
Total computations: 856m + 3496a (per 16x16 block)
Operations per pixel: 3.34m + 13.65a
40% improvement on Neri Merhav's approach
22Video Resizing
PSNR difference between Spatial and Integrated
Scheme
Plots shown for the videos Mobile and Flower.
23Video Resizing
Average PSNR Comparison Chart (videos are downscaled from CIF (1.15 Mbps) to QCIF (512 Kbps))

          Spatial Domain    Neri Merhav       MBIMC             Integrated
Video     I       P         I       P         I       P         I       P
Susi      32.32   32.58     35.90   35.85     35.90   35.85     35.90   33.92
Tennis    24.04   23.90     25.95   25.53     25.95   25.53     25.95   23.89
Mobile    21.03   22.50     22.62   24.30     22.62   24.30     22.62   22.88
Flower    21.99   23.20     24.07   25.62     24.07   25.62     24.07   23.85
24Video Resizing
Motion Vector Re-estimation
25Video Resizing
Algorithms for Motion Vector Re-estimation
- Adaptive Motion Vector Resampling Technique (AMVR)
- Maximum Average Correlation (MAC)
- Median Method
- Non-Linear Motion Vector Resampling Technique (NLMR)
- And many more
26Video Resizing
Comparison of Motion Vector Re-estimation Methods
Video: Coastguard. Frames: 300. From CIF (1.15 Mbps) to QCIF (500 Kbps).
27Video Resizing
Comparison of Motion Vector Re-estimation Methods
Video: Container. Frames: 300. From CIF (1.15 Mbps) to QCIF (500 Kbps).
28Video Resizing
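Pure DCT Domain based Proposed System (block diagram)
Decoding side: Input Data -> Buffer -> VLC Decoder -> Q^-1 -> DCT blocks -> MBIMC Scheme -> DCT Frame -> DCT Downscaling -> Intra DCT Blocks
Motion information: Motion Vector -> AMVR; macroblock types are chosen by MTSS
Encoding side: DCT Frame -> subtraction of the DCT Based Motion Compensation prediction -> Q (with Q Step Size) -> VLC Encoder -> Buffer -> Output, with Q^-1 and Frame Memory in the reconstruction loop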
29Video Resizing
Computational Complexity of Proposed System

Function              Complexity                          Mults.   Adds      Shifts
Inverse Quant.        64m per 8x8 block                   101376   -         -
MB IMC                3.43m, 20.5a per pixel              347720   2078208   -
DCT downscale by 2    1.25m, 1.25a per pixel              126720   126720    -
AMVR                  9m, 30a, 1 shift per 16x16 block    891      2970      99
DCT domain MC         3.43m, 20.5a per pixel              86930    519552    -
Quant.                64m per 8x8 block                   25344    -         -
Total                                                     688981   2727450   99

(Conversion of a P frame from CIF to QCIF)
Total operations count (Add = 1 op, Shift = 1 op, Mult. = 3 ops): 4794492
16 times faster than the Spatial Domain Method
30Video Resizing
Comparison of Pure DCT and Hybrid System
Avg. PSNR (Y, U, V): Hybrid (24.2630, 32.4528, 32.4523); Pure DCT (25.7454, 42.2119, 43.1655)
31Video Resizing
Comparison of Pure DCT and Spatial System
Avg. PSNR (Y, U, V): Spatial (25.1723, 32.5405, 32.5595); Pure DCT (25.7454, 42.2119, 43.1655)
32Video Resizing
Optimization to Pure DCT based proposed system (utilizing the sparseness of DCT blocks)

Function              Complexity                                               Mults.   Adds     Shifts
Inverse Quant.        64m per 8x8 block                                        101376   -        -
MB IMC                0.9m, 6.8a per pixel (assuming only 16 non-zero coeff.)  91238    689357   -
DCT downscale by 2    1.25m, 1.25a per pixel                                   126720   126720   -
AMVR                  9m, 30a, 1 shift per 16x16 block                         891      2970     99
DCT domain MC         0.9m, 6.8a per pixel (assuming only 16 non-zero coeff.)  22810    172339   -
Quant.                64m per 8x8 block                                        25344    -        -
Total                                                                          368379   991386   99

(Conversion of a P frame from CIF to QCIF)
Total operations count (Add = 1 op, Shift = 1 op, Mult. = 3 ops): 2096622
36 times faster than the Spatial Domain Method
33Video Resizing
Comparison of Optimized Pure DCT and Hybrid System
Avg. PSNR (Y, U, V): Hybrid (24.2630, 32.4528, 32.4523); Optimized Pure DCT (25.1310, 42.2768, 43.2303)
34Video Resizing
Comparison of Optimized Pure DCT and Spatial
System
Avg. PSNR (Y, U, V): Spatial (25.1723, 32.5405, 32.5595); Optimized Pure DCT (25.1310, 42.2768, 43.2303)
35Video Resizing
Average PSNR Comparison Chart (videos are downscaled from CIF (1.15 Mbps) to QCIF (512 Kbps))

              Spatial Domain Method    Hybrid Domain Method     DCT Domain Method (assuming 16 non-zero coeff.)
Video         Y       U       V        Y       U       V        Y       U       V
Coastguard    25.17   32.54   32.55    24.26   32.45   32.45    25.13   42.22   43.16
Foreman       28.61   32.20   32.08    28.29   32.00   31.92    29.42   40.09   41.09
Susi          33.90   32.94   32.44    33.86   32.93   32.43    36.83   40.92   40.73
Tennis        24.98   32.36   31.58    24.97   32.36   31.57    26.49   41.60   41.95
36Conclusion
- The modified IMC (MBIMC) scheme provides a 27% improvement over the existing IMC technique.
- The integrated (IMC + downscaling) scheme provides a 40% improvement.
- Our proposed DCT domain based video downscaling system is 36 times faster than the spatial domain method.
- Our proposed DCT domain based video downscaling system produces approx. 1.5 dB better output than the hybrid and spatial domain systems.
37Results
Thank You.
38MPEG Introduction
A Typical MPEG stream Structure
Sequence layer: Seq. Header | GOP Header | GOP | GOP Header | GOP | ... | MPEG End Code
GOP layer (example frame patterns):
I B B P B B P B B P B B I B B . . .
I P P P P P P P P P P P I P P . . .
I I I I I I . . .
Picture layer: Pic. Header | PIC
Macroblock layer: MacroBlock header | Macroblock, made up of 8x8 blocks
39MPEG Introduction
Types of Frames in an MPEG stream
- I-frames are intra compressed, as in JPEG. Their purpose is to provide random access points to the video.
- P-frames are motion compensated, forward predictive coded frames. They are inter-frame compressed and typically provide more compression than I-frames.
- B-frames are motion compensated, bi-directionally predictive coded frames. They are inter-frame compressed and typically provide the most compression.
40MPEG Introduction
Intra Frame Encoding
For each 8x8 block: DCT -> Quant. -> Zig-Zag Scan -> RLE -> Huffman -> bitstream (e.g. 011000011010)
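The zig-zag scan and run-length steps of this pipeline can be sketched as follows. This is a simplified illustration, not the exact MPEG syntax: it treats the DC coefficient like any other value, emits plain (zero-run, value) pairs with an "EOB" marker, and leaves out the Huffman stage. All function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    def key(p):
        s = p[0] + p[1]                     # index of the anti-diagonal
        return (s, p[0] if s % 2 else -p[0])  # odd diagonals run downwards
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)

def run_length(coeffs):
    """Encode a coefficient list as (zero_run, value) pairs plus an end marker."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")                     # trailing zeros are implied
    return pairs

# A quantized block with only three non-zero coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 12, -3, 5

scanned = [block[i][j] for i, j in zigzag_order()]
encoded = run_length(scanned)
print(encoded)
```

Because quantization zeroes most high-frequency coefficients, the zig-zag order groups the zeros at the end of the scan, where the run-length stage collapses them.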
41MPEG Introduction
P-Frame Encoding
B-Frame Encoding
42MPEG Introduction
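Motion Estimation & Prediction to construct Inter Frames (P/B-frames)
(Figure: a macroblock m is matched against the Reference frame; the error e between m and its prediction is what gets coded.)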
43Video Resizing
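DCT Domain based Inverse Motion Compensation (Neri Merhav's Scheme)
The reference block overlaps four adjacent 8x8 blocks x1, x2, x3, x4; h and w are the height and width of its overlap with x1. The prediction error (inter) block E is added to the prediction to give the (intra) block. The prediction is
$$x = \sum_{i=1}^{4} c_{i1}\, x_i\, c_{i2} \qquad (1)$$
where c_{ij}, i = 1, ..., 4, j = 1, 2 are sparse 8x8 matrices of zeros and ones. These matrices perform window and shift operations:
$$c_{11} = c_{21} = U_h = \begin{bmatrix} 0 & I_h \\ 0 & 0 \end{bmatrix}, \qquad c_{31} = c_{41} = L_{8-h}$$
$$c_{12} = c_{32} = L_w = \begin{bmatrix} 0 & 0 \\ I_w & 0 \end{bmatrix}, \qquad c_{22} = c_{42} = U_{8-w}$$
Here, I_h and I_w are identity matrices of dimension h x h and w x w, respectively.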
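The window and shift behaviour of these matrices can be checked in the spatial domain. The sketch below assumes x1, x2, x3, x4 are the top-left, top-right, bottom-left and bottom-right blocks of a 16x16 area, and that U_n and L_n place an n x n identity in the top-right and bottom-left corner of an 8x8 zero matrix; the helper names are illustrative.

```python
def shifted_identity(n, row0, col0, size=8):
    """size x size zero matrix with an n x n identity starting at (row0, col0)."""
    m = [[0] * size for _ in range(size)]
    for k in range(n):
        m[row0 + k][col0 + k] = 1
    return m

def U(n):  # [[0, I_n], [0, 0]]
    return shifted_identity(n, 0, 8 - n)

def L(n):  # [[0, 0], [I_n, 0]]
    return shifted_identity(n, 8 - n, 0)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def predicted_block(x1, x2, x3, x4, h, w):
    """Evaluate x = sum_i c_i1 * x_i * c_i2 with the c_ij listed on the slide."""
    terms = [(U(h), x1, L(w)), (U(h), x2, U(8 - w)),
             (L(8 - h), x3, L(w)), (L(8 - h), x4, U(8 - w))]
    out = [[0] * 8 for _ in range(8)]
    for left, x, right in terms:
        part = matmul(matmul(left, x), right)
        out = [[out[i][j] + part[i][j] for j in range(8)] for i in range(8)]
    return out

# 16x16 test area whose pixel value encodes its position.
frame = [[16 * r + c for c in range(16)] for r in range(16)]

def block(r0, c0):
    return [row[c0:c0 + 8] for row in frame[r0:r0 + 8]]

h, w = 3, 5
pred = predicted_block(block(0, 0), block(0, 8), block(8, 0), block(8, 8), h, w)
expected = block(8 - h, 8 - w)   # the 8x8 window straddling all four blocks
print(pred == expected)
```

Each product c_i1 x_i c_i2 cuts out the part of x_i covered by the reference block and shifts it into place, so the four terms tile the predicted block without overlap.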
44Video Resizing
Expression (1) can be written as
$$X = \sum_{i=1}^{4} S\,c_{i1}\,S^t\; X_i\; S\,c_{i2}\,S^t \qquad (2)$$
since S^t S = I, where X and X_i are the DCTs of x and x_i.
S is the 8-point DCT matrix. It can be factorized as S = D P B1 B2 M A1 A2 A3, where D is a diagonal matrix, P is a permutation matrix, B1, B2, A1, A2, A3 are sparse matrices of 0, 1 and -1, and M is a sparse matrix of real numbers.
45Video Resizing
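Expression (2) can further be written as
$$X = S\,\big[\,J_h B_2^t B_1^t P^t D\,( X_1 D P B_1 B_2 J_w^t + X_2 D P B_1 B_2 K_{8-w}^t ) + K_{8-h} B_2^t B_1^t P^t D\,( X_3 D P B_1 B_2 J_w^t + X_4 D P B_1 B_2 K_{8-w}^t )\,\big]\,S^t$$
where J_i = U_i (M A1 A2 A3)^t and K_i = L_i (M A1 A2 A3)^t, i = 1, 2, ..., 8.
(The slide shows the matrix J6 as an example, with entries built from a = 0.7071, b = 0.9239 and c = 0.3827.)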
46Video Resizing
Adaptive Motion Vector Resampling Technique (AMVR)
where
mv = new motion vector for the downsampled macroblock
MVi = motion vector of the ith macroblock
Ai = activity measurement of residual block i (in the original video); it could be the number of non-zero entries in the block.
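A minimal sketch of AMVR for downscaling by two, assuming the commonly cited activity-weighted form mv = 1/2 * sum(MVi * Ai) / sum(Ai) over the four original macroblocks. That formula, the function name and the fallback for zero total activity are assumptions made for illustration.

```python
def amvr(motion_vectors, activities):
    """Activity-weighted average of the motion vectors, halved for 2:1 downscaling."""
    total = sum(activities)
    if total == 0:
        # Assumed fallback: plain average when every residual block is empty.
        activities = [1] * len(motion_vectors)
        total = len(motion_vectors)
    mx = sum(a * x for (x, _), a in zip(motion_vectors, activities)) / total
    my = sum(a * y for (_, y), a in zip(motion_vectors, activities)) / total
    return (mx / 2, my / 2)

# Four neighbouring macroblocks; activity = number of non-zero residual coefficients.
mvs = [(4, 2), (6, 2), (4, -2), (8, 0)]
acts = [10, 0, 5, 5]
print(amvr(mvs, acts))
```

A macroblock with zero activity, such as the second one here, does not influence the result, so the new vector follows the blocks whose prediction was hardest.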
47Video Resizing
Maximum Average Correlation (MAC) Method
where
v = the vector with maximum weighted average correlation with the four motion vectors in the original video bit stream
Wk = activity measurement of the kth macroblock
dk = Euclidean distance between v and the motion vector of the kth macroblock
ρ = 0.85, the spatial correlation factor
48Video Resizing
Median Method
Let V = {v1, v2, v3, v4} represent four adjacent motion vectors.
The median vector is defined as
med(V) = vk ∈ V such that dk = min_i di
For video halving:
v = ½ vk
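The median selection can be sketched as below, assuming di is the sum of Euclidean distances from vi to the other vectors in V; the slide does not define di in text, so that definition is an assumption, as are the function names.

```python
import math

def median_vector(vectors):
    """Return the vector whose summed distance to all the others is smallest."""
    def total_distance(v):
        return sum(math.dist(v, other) for other in vectors)
    return min(vectors, key=total_distance)

def halved_median(vectors):
    """Median vector scaled by 1/2 for downscaling by a factor of two."""
    vx, vy = median_vector(vectors)
    return (vx / 2, vy / 2)

# Three similar vectors and one outlier.
V = [(4, 2), (6, 2), (4, -2), (20, 10)]
print(median_vector(V), halved_median(V))
```

Unlike an average, the result is always one of the original vectors, so the outlier (20, 10) does not pull the estimate away from the cluster.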
49Video Resizing
Non-linear motion vector resampling (NLMR) method
- This method suggests that the following parameters are statistically related:
  - Activity of macroblock (A)
  - Quantization step size (Q)
  - Number of clustered motion vectors (C)
  - Magnitude of motion vector (M)
- As A, Q and C increase and M decreases, the probability that the corresponding motion vector is the best motion vector increases.
- A likelihood score Li is computed for each macroblock.
- The motion vector corresponding to the highest Li is selected.
50Macroblock Type Selection Scheme (MTSS)
(Figure: selection of the output macroblock type, Intra or Predicted, from the types of the input macroblocks.)
51Video Resizing
Hybrid System (Computational Complexity)

Function                      Complexity                          Mults.   Adds      Shifts
Inverse Quant. + IDCT         144m, 464a per 8x8 block            228096   734976    -
Inverse Motion Compensation   256a per 16x16 block                -        101376    -
Downscale by 2                3a, 1s per pixel                    -        76032     25344
AMVR                          9m, 30a, 1 shift per 16x16 block    891      2970      99
Motion Compensation           256a per 16x16 block                -        25344     -
DCT + Quant.                  144m, 464a per 8x8 block            57024    183744    -
Total                                                             286011   1124442   25443

(Conversion of a P frame from CIF to QCIF)
Total operations count (Add = 1 op, Shift = 1 op, Mult. = 3 ops): 2007918
Speed up: 37 times over the spatial domain technique
52Video Resizing
Jr matrices when r is even and when r is odd.
2 -2 0 0 D 2A E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2 -2A 02 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 -2 0 0 -D -2A -E 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 A 1 C 0 -B -1 1 1 A 1 -C 0 B 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 -2 0 0 D 2A E 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 2 -2A -2 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 -2 0 0 -D -2A -E 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 A 1 C 0 -B -1 1 1 A 1 -C 0 B 1
The matrix above is J2 (r even), where A = 0.7071, B = 0.9239, C = 0.3827, D = 0.5412 and E = 1.3066.
2 0 -2A -1 -2B -A -2C 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 2A 1 2C -A -2B -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 2A 1 -2C A 2B 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 -2A -1 2B A 2C 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 -2A -1 -2B -A -2C 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 2 0 2A 1 2C -A -2B -1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 2A 1 -2C A 2B 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 -2A -1 -2B A 2C 0
The matrix above is J5 (r odd), where A = 0.7071, B = 0.9239 and C = 0.3827.
53Video Resizing
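Efficient scheme to compute u = J2 v, where u = (u1, ..., u8)^t and v = (v1, ..., v24)^t.
The intermediate terms are formed once for each group of eight inputs; a suffix (1), (2) or (3) marks the group.

Term   Group 1 (v1..v8)   Group 2 (v9..v16)   Group 3 (v17..v24)
Y1     v1 + v2            v9 + v10            v17 + v18
Y2     v1 - v2            v9 - v10            v17 - v18
Y3     A v3               A v11               A v19
Y4     D v5               D v13               D v21
Y5     E v7               E v15               E v23
Y6     C v5               C v13               C v21
Y7     B v7               B v15               B v23
Y8     A v6               A v14               A v22
Y9     Y6 - Y7            Y6 - Y7             Y6 - Y7

u1 = 2Y2(1) + Y4(1) + 2Y8(1) + Y5(1)
u2 = 2Y1(1) - 2Y3(1) - 2v4
u3 = 2Y2(1) - Y4(1) - 2Y8(1) - Y5(1)
u4 = Y1(1) + Y3(1) + v4 + Y9(1) - v8 + Y1(2) + Y3(2) + v12 - Y9(2) + v16
u5 = 2Y2(2) + Y4(2) + 2Y8(2) + Y5(2)
u6 = 2Y1(2) - 2Y3(2) - 2v12
u7 = 2Y2(2) - Y4(2) - 2Y8(2) - Y5(2)
u8 = Y1(2) + Y3(2) + v12 + Y9(2) - v16 + Y1(3) + Y3(3) + v20 - Y9(3) + v24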
54Video Resizing
Efficient scheme to compute u = J5 v, where u = (u1, ..., u8)^t and v = (v1, ..., v24)^t
Y1 2Av3 v4 Y1 2Av11 v12 Y1 2Av19 v20
Y2 2(BC)(v5v7) Y2 2(BC)(v13v15) Y2 2(BC)(v21v23)
Y3 2Cv5 Y3 2Cv13 Y3 2Cv21
Y4 2Bv7 Y4 2Bv15 Y4 2Bv23
Y5 Av6 Y5 Av14 Y5 Av22
Y6 Y2 Y3 Y4 Y6 Y2 Y3 Y4 Y6 Y2 Y3 Y4
Y7 Y5 v8 Y7 Y5 v16 Y7 Y5 v24
Y8 Y3 Y4 Y7 Y8 Y3 Y4 Y7 Y8 Y3 Y4 Y7
u1 2v1 Y1 Y6 Y5 u2 2v1 Y1 Y8 u3 2v9
Y1 Y8 u4 2v9 Y1 Y6 Y5 u5 2v9
Y1 Y6 Y5 u6 2v9 Y1 Y8 u7 2v17
Y1 Y8 u8 2v17 Y1 Y6 Y5
55H.264 Resizing
56Relation between Integer DCT and Real DCT
57
- To simplify the implementation, d is approximated by 0.5.
- To ensure that the transform remains orthogonal, b also needs to be modified, such that a = 1/2, b = \sqrt{2/5}, d = 1/2.
58
- The 2nd and 4th rows of matrix C and the 2nd and 4th columns of matrix C^T are scaled by a factor of 2.
- The post-scaling matrix E is scaled down to compensate, giving Ef.
- This transform is an approximation to the 4x4 DCT but is not equal to it.
- Forward transform and inverse transform are not the same.
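The relation between the integer core transform and its scaled, orthogonal counterpart can be checked numerically. The sketch assumes the commonly published H.264 4x4 forward core matrix and the values a = 1/2, b = sqrt(2/5); neither matrix survives as text in these slides, so both are assumptions, as are the names `CF`, `EF` and `ROW_SCALE`.

```python
import math

a = 0.5
b = math.sqrt(2 / 5)

# Integer forward core matrix (rows 2 and 4 already scaled by 2).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

# Per-row scale factors; b is halved on rows 2 and 4 to undo the scaling by 2.
ROW_SCALE = [a, b / 2, a, b / 2]

# Orthogonal matrix A = diag(ROW_SCALE) * CF and post-scaling matrix Ef.
A = [[ROW_SCALE[i] * CF[i][j] for j in range(4)] for i in range(4)]
EF = [[ROW_SCALE[i] * ROW_SCALE[j] for j in range(4)] for i in range(4)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transpose(p):
    return [list(col) for col in zip(*p)]

def forward_scaled(x):
    """Integer transform CF * x * CF^T followed by element-wise scaling with Ef."""
    w = matmul(matmul(CF, x), transpose(CF))
    return [[w[i][j] * EF[i][j] for j in range(4)] for i in range(4)]

x = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
identity_check = matmul(A, transpose(A))
reference = matmul(matmul(A, x), transpose(A))
scaled = forward_scaled(x)
print(identity_check[0][0], identity_check[1][1])
```

The integer part uses only additions and shifts, and the element-wise factor Ef is the part that can be folded into the quantizer.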
59
- The forward and inverse transforms are orthogonal, i.e. T^-1(T(X)) = X.
- Ef and Ei are scaling matrices that can be incorporated into the quantizer. Hence:
- Real forward DCT: the input is transformed by the integer forward transform and then scaled by Ef.
- Real inverse DCT: the input is scaled by Ei and then transformed by the integer inverse transform.
60Conversion of a H.264 P frame to an I frame
- A macroblock may be partitioned into any of seven types: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4.
- For each macroblock partition type there may be 10 prediction types:
  - Full pel prediction
  - Horizontal only: half pel or quarter pel
  - Vertical only: half pel or quarter pel
  - Horizontal and then vertical: half pel or quarter pel
  - Vertical and then horizontal: half pel or quarter pel
  - Diagonal prediction: half pel or quarter pel
61What is Transcoding?
Transcoding: a process in which a coded bit stream is converted into another one of a different bit rate or a different format.
Pre-encoded bit stream -> Transcoder -> bit stream of a different bit rate or a different format
62Pixel Domain Transcoder(PDT)
H.264 video file -> YUV frames -> MPEG-2 video file
63MPEG-2 encoder vs. PDT
Pixel Domain Transcoder Frame
MPEG-2 Frame
64MPEG-2 encoder vs. PDT contd.
MPEG-2 Encoder Vs Pixel Domain Transcoder
65DCT Domain Transcoder
(Block diagram: VLD -> IQ -> subtraction -> Q2 -> VLC, with a feedback loop through IQ, MEMORY and MC-DCT; Motion Vectors and Block types are reused.)
66Enhancement of PDT
67Adaptive Motion Vector Re-estimation(AMVR)
- Weighted average approach
- Align to best prediction error vector
  - Criterion: used if the object boundary blocks have lower prediction error than the background blocks.
- Align to worst prediction error vector
  - Criterion: used if the object boundary blocks have higher prediction error than the background blocks.
68AMVR-Contd
MVi is the motion vector of block i of the H.264 stream. Ai denotes the activity measurement of the block.
69Median Method
Extracts the motion vector situated in the middle
of the rest of the motion vectors
70Non-Linear Motion Vector Re-estimation
- The motion vector at minimum distance from the optimal one is the best matching motion vector.
- Four parameters are defined for each block:
  - A: activity measurement
  - C: cluster of motion vectors
  - Q: quantization step size
  - M: magnitude of motion vector
71NLMR contd
- For each referenced block i, a likelihood score Li is defined: the likelihood that the block matches the optimal one.
- Li is incremented whenever Ai, Ci or Qi is the highest, or Mi is the lowest, among the 16 blocks.
- The motion vector corresponding to the highest L is the best matching motion vector.
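The scoring rule can be sketched as follows. The dictionary keys, the tie handling (every block that reaches the extreme value gets a point, and the first block wins a tied score) and the small three-block example are illustrative assumptions; how A, C, Q and M are measured is left to the caller.

```python
def nlmr_select(blocks):
    """Pick the motion vector of the block with the highest likelihood score.

    Each block is a dict with keys 'mv', 'A', 'C', 'Q' and 'M'.
    """
    scores = [0] * len(blocks)
    # A, C and Q vote for the largest value, M for the smallest.
    for key, pick in (("A", max), ("C", max), ("Q", max), ("M", min)):
        best = pick(b[key] for b in blocks)
        for i, b in enumerate(blocks):
            if b[key] == best:
                scores[i] += 1
    winner = max(range(len(blocks)), key=lambda i: scores[i])
    return blocks[winner]["mv"], scores

blocks = [
    {"mv": (2, 0), "A": 5, "C": 1, "Q": 8, "M": 2.0},
    {"mv": (4, 4), "A": 9, "C": 3, "Q": 8, "M": 5.7},
    {"mv": (0, 2), "A": 2, "C": 1, "Q": 4, "M": 2.0},
]
best_mv, scores = nlmr_select(blocks)
print(best_mv, scores)
```

The same function works unchanged for the 16 referenced blocks mentioned on the slide, since it only compares each parameter across whatever list it is given.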
76Computations Required
Method        Additions and subtractions   Multiplications and divisions   Shifts and comparisons   Total   Saving per frame
Full Search   9669s, 2733a                 22m                             262s, 836c               23622   0
AMVR          50a                          34m, 2d                         1c                       87      99.6% of ME time
Median        242a                         2d                              16c                      260     98.89% of ME time
NLMR          512s, 22a                    24m, 2d                         352c                     912     96.13% of ME time