FastVDO Unified 16Bit Framework presentation

1
FastVDO Unified 16-Bit Framework

Pankaj Topiwala
FastVDO LLC
Columbia, MD 21046 USA
pnt_at_fastvdo.com
JVT-B103

2
In the Beginning (April 01)

April 01: FastVDO showed how H.26L can be made
fully 16-bit with no loss of performance (M16).
At SB, 4 other proposals also supported 16 bits
Key lessons learned:
Negligible performance or complexity difference
Quantization is very flexible! TI/Sharp showed
that it can be manipulated to fit even the TML
transform into 16 bits
Quantization memory requirements can be minimized
by periodicity
Proper focus: the transform, which can limit
applications
Quantization can be safely decoupled
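
The periodicity point can be illustrated with a small sketch. It assumes, purely for illustration, that the quantizer step size doubles after every full period of the quantization parameter, so only one period of base values has to be stored. The period of 6, the scale of 64, and the names PERIOD, BASE_STEPS, and quant_step are hypothetical and are not taken from the presentation.

```python
# Hypothetical periodic quantizer table: only one period is stored.
PERIOD = 6  # assumed period, for illustration only

# One period of integer step sizes, geometrically spaced (assumed scale of 64).
BASE_STEPS = [round(64 * 2 ** (i / PERIOD)) for i in range(PERIOD)]


def quant_step(qp):
    """Step size for parameter qp: base entry, doubled once per full period."""
    return BASE_STEPS[qp % PERIOD] << (qp // PERIOD)


if __name__ == "__main__":
    for qp in (0, 5, 6, 11, 12):
        print(qp, quant_step(qp))
```

With this layout the table size is fixed at PERIOD entries no matter how large the parameter range becomes; the rest is a shift.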

3
Rapid Growth of Applications

But the application space of H.26L is growing:
low-rate wireless conversational (28 kb/s - 256
kb/s),
mid-rate streaming, VOD (64 kb/s - 1 Mb/s)
high-rate TV broadcasting (1 - 4 Mb/s)
high-rate storage for future DVD (5 - 30 Mb/s)
Multirate digital cinema (mid-rate, visually
lossless distribution, and lossless archive: 30
Mb/s, 200 Mb/s, and low Gb/s)
ultra-high rate HDTV (30 Mb/s - low Gb/s)
lossless medical (similar)
Entertainment applications poised to dominate.

4
Desire

One framework that fits all applications
Reduce fragmentation of standard
Limit proliferation of inconsistent technologies
Improve interoperability between profiles
Significantly improve content reuse
FastVDO introduced such a framework
Still inadequately understood
Will now explain concretely

5
Motivation

Similar coding performance as the DCT
Supporting a 16-bit (or less) architecture
Low complexity, multiplierless implementation
(adds, right shifts)
Invertible integer-to-integer mapping
In-place computation

6
DCT

Coding gain: 7.57 dB
Complexity: 8 adds, 6 mults in floating point
Integer approximation
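
The 7.57 dB figure can be checked with a short sketch. It assumes the usual definition of coding gain (arithmetic mean over geometric mean of the transform coefficient variances) for an orthonormal 4-point DCT-II, and an AR(1) source with correlation 0.95, the model named on the BinDCT slides. The function names dct_matrix and coding_gain_db are illustrative only.

```python
import math


def dct_matrix(n):
    """Orthonormal n-point DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def coding_gain_db(transform, rho=0.95):
    """Coding gain in dB for a unit-variance AR(1) source with correlation rho.

    Assumes the rows of `transform` are orthonormal.
    """
    n = len(transform)
    variances = []
    for row in transform:
        # Variance of one coefficient: row * R * row^T with R[i][j] = rho^|i-j|.
        var = sum(row[i] * row[j] * rho ** abs(i - j)
                  for i in range(n) for j in range(n))
        variances.append(var)
    arithmetic_mean = sum(variances) / n
    geometric_mean = math.exp(sum(math.log(v) for v in variances) / n)
    return 10.0 * math.log10(arithmetic_mean / geometric_mean)


if __name__ == "__main__":
    print(round(coding_gain_db(dct_matrix(4)), 2))
```

Under these assumptions the 4-point result agrees with the 7.57 dB quoted on the slide. The same function can be applied to any other orthonormal 4-point transform for comparison.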

7
Lifting Structure
8
Generic 4-Pt Lifting Transform
[Figure: lifting flow graph; labels e, b, f, c, d]
Note: If a, ..., u are dyadic rationals, then this
- is exactly invertible!
- has a mult-free definition, in multiple ways, and
- is very stingy in bit expansion!
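
Why dyadic multipliers give exact invertibility can be sketched with a single lifting step. This is a minimal illustration, not the slide's flow graph: the multiplier is assumed to be num / 2**shift, and the names lift and unlift are hypothetical. The inverse subtracts exactly the same rounded term that the forward step added, so the rounding error cancels.

```python
def lift(x, y, num, shift):
    """Forward lifting step: y += floor(num * x / 2**shift); x is unchanged.

    For small dyadic multipliers the product num * x can itself be built
    from shifts and adds, so no true multiplier is needed in hardware.
    """
    return x, y + ((num * x) >> shift)


def unlift(x, y, num, shift):
    """Inverse lifting step: subtract the identical rounded term."""
    return x, y - ((num * x) >> shift)


if __name__ == "__main__":
    x, y = lift(13, -7, 7, 4)     # multiplier 7/16
    print(unlift(x, y, 7, 4))     # recovers the original pair
```

Because x passes through unchanged, the inverse can always recompute the rounded term; this holds for negative inputs as well, since the same floor operation is used in both directions.
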
9
Generic 4-Pt Inverse Transform
10
Example 1: FastVDO X1 - Hadamard
a = u = 1/2; b = c = d = e = f = p = 1.
Three equivalent implementations:
- matrix multiply
- mult-free direct (8 adds, 0 scalings)
- lifting (also mult-free: 8a, 2s)
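
The equivalence of the first two implementations can be sketched as follows. The sketch assumes an unnormalized 4-point Hadamard matrix in sequency order; the presentation's own normalization and row order are not recoverable from the transcript, and the names H4, hadamard_matrix, and hadamard_butterfly are illustrative. The lifting form is omitted because its flow graph did not survive.

```python
# Unnormalized 4-point Hadamard matrix, rows in sequency order (assumed).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def hadamard_matrix(x):
    """Matrix-multiply implementation: y = H4 * x."""
    return [sum(h * v for h, v in zip(row, x)) for row in H4]


def hadamard_butterfly(x):
    """Mult-free direct implementation: 8 additions/subtractions, no scaling."""
    x0, x1, x2, x3 = x
    s0, s1 = x0 + x3, x1 + x2      # 2 adds
    d0, d1 = x0 - x3, x1 - x2      # 2 adds
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]   # 4 adds


if __name__ == "__main__":
    sample = [5, -3, 12, 7]
    print(hadamard_matrix(sample) == hadamard_butterfly(sample))
```
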
11
Example 2: FastVDO X2
a = 1/2; b = c = d = e = f = 1; u = p = -1.
Three equivalent implementations:
- matrix multiply
- mult-free direct (8 adds, 1 scaling)
- lifting (also mult-free: 8a, 1s)
12
Example 3: FastVDO X3
a = 1/2; b = c = d = e = f = 1; p = -2, u = 2/5.
W.K. Cham, 1989. X3 proposed by MS, Nokia.
Non-dyadic numbers mean a non-invertible
transform.
Three equivalent implementations:
- matrix multiply
- mult-free direct (8 adds, 2 scales)
- lifting (but with mults!! or approx. u)
13
Example 4: FastVDO X4
a = p = u = 1/2; b = c = d = e = f = 1.
Note: High coding gain. CG(X4) = 7.55 dB;
CG(DCT) = 7.57 dB
Three equivalent implementations:
- matrix multiply
- mult-free direct (9 adds, 2 scalings)
- lifting (also mult-free: 8a, 3s)
14
Example 5: FastVDO X5
a = 1/2; b = c = d = e = f = 1; p = 7/16, u = 3/8.
Note: High coding gain. CG(X5) = 7.57 dB;
CG(TML) = 7.57 dB; CG(DCT) = 7.57 dB
Three equivalent implementations:
- matrix multiply
- mult-free direct (9 adds, 7 scales)
- lifting (also mult-free: 10a, 5s)
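
How the dyadic values 7/16 and 3/8 become mult-free can be sketched with shifts and adds. The decompositions 7/16 = 1/2 - 1/16 and 3/8 = 1/4 + 1/8 are one possible choice among several, and the two-step arrangement in forward_pair is a hypothetical example rather than the X5 flow graph, which is not in the transcript; the operation counts here are not meant to match the slide's.

```python
def mul_7_16(x):
    """Approximate 7/16 * x as x/2 - x/16: two right shifts, one subtraction."""
    return (x >> 1) - (x >> 4)


def mul_3_8(x):
    """Approximate 3/8 * x as x/4 + x/8: two right shifts, one addition."""
    return (x >> 2) + (x >> 3)


def forward_pair(a, b):
    """Two lifting steps using the shift-add multipliers (assumed arrangement)."""
    b = b - mul_7_16(a)
    a = a + mul_3_8(b)
    return a, b


def inverse_pair(a, b):
    """Undo the two lifting steps in reverse order with identical rounding."""
    a = a - mul_3_8(b)
    b = b + mul_7_16(a)
    return a, b


if __name__ == "__main__":
    print(inverse_pair(*forward_pair(100, -37)))   # recovers the original pair
```

The shifts round, so each multiplier is only approximate for general inputs, yet the pair stays exactly invertible because the inverse reuses the same rounded values.
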
15
Detailed Implementation of X5
16
Performance-Complexity
17
Dynamic Range
18
8 x 8 BinDCT
Coding gain: 8.77 to 8.82 dB for an AR(1) process
with ρ = 0.95
19
16 x 16 BinDCT
Coding gain: 9.4499 dB for an AR(1) process with
ρ = 0.95
20
Lessons Learned

All transforms considered fall under our rubric
(other than TML)
No new transforms introduced in 9 months
Growing app. list needs transform innovation
Quantization is very flexible
Innovations have in fact been made in
quantization
Sharp/TI showed that even TML can be made 16-bit
Quantization can be adapted to transform

21
Lessons Learned (2) - TML

TML transform
OK for low-complexity, wireless app. using fixed
hard-wired architectures that need matrix
multiply
But unfriendly integers: not good for ASIC
Not optimized for bit preservation, high-rate
apps.
Not invertible
Not generalizable to larger transforms
Satisfies one transform method only: direct
matrix multiply

22
Lessons Learned (3) - Cham

OK for matrix and mult-free applications
Notionally adds 6 bits in forward transform
Needs truncation for higher-bit data
Likely penalty for high-rate, high-bit sources
Testing on high-bit data critical
Is not invertible (lifting not dyadic rational)
Does not generalize to higher transforms
Satisfies 2 (of 3) transform methods
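
The "adds 6 bits" figure is consistent with a simple worst-case bound, sketched below. The sketch assumes the integer 4-point matrix with entries of magnitude 1 and 2 (taken here to be the Cham-type X3 matrix, which the slides do not reproduce), applied separably to the rows and columns of a 4x4 block with no intermediate scaling. The names CHAM_4PT, HADAMARD_4PT, and growth_bits_2d are illustrative.

```python
# Assumed Cham-type integer matrix; not reproduced in the slides.
CHAM_4PT = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Unnormalized Hadamard matrix, for comparison.
HADAMARD_4PT = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def growth_bits_2d(matrix):
    """Worst-case extra bits after a separable 2-D transform.

    The 1-D worst-case gain is the largest row sum of absolute values;
    the 2-D gain is its square. The result is ceil(log2(gain_2d)).
    """
    gain_1d = max(sum(abs(c) for c in row) for row in matrix)
    gain_2d = gain_1d * gain_1d
    return (gain_2d - 1).bit_length()


if __name__ == "__main__":
    print(growth_bits_2d(CHAM_4PT), growth_bits_2d(HADAMARD_4PT))
```

Under these assumptions the 2-D gain is 36 for the Cham-type matrix, which rounds up to 6 bits, against 4 bits for the unnormalized Hadamard matrix.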

23
Relative Merits

General Comparisons
All contenders match current 32-bit performance
All offer reduced, nearly identical complexity
Unique Advantages
General framework to address broad range of needs
Very tight bit control
Demonstrated 16-bit output for 12-bit input, no
truncation
Suitable for higher-bit data, and high rates
Related designs for higher sizes (8-pt, 16-pt)
Advantages of lifting improve further with size

24
Currently No Consensus

First address the low-complexity, low-rate
problem
Consider high-rate problem later, probably with a
different transform
If lossless is needed, probably a 3rd transform
Energy misdirected to date
Some proponents backed single transforms, assumed
original
Tests for performance, complexity metrics
inconclusive
Missing the bigger picture: support a wide
variety of apps
Our vision: use a single framework if possible
Going forward: focus on our individual strengths

25
Recommendations

Transform and Quantization can be decoupled
Adopt the framework
Prefer downloadable filters
Innovate in the transforms, goal of 3 transform
methods
Finalize in the reflector, adopt in May
Tailor transform to wide variety of applications
Transform Activity can work directly in
conjunction with other groups (e.g., Trans. Size,
ABT, Interlace, Quantization)
Quant can focus on transform adaptation, finer
quantization, periodicity, etc.

26
Recommendations (2)

Focus transforms on high-quality, high-rate
Low-complexity case well understood
High-rate apps just emerging in JVT
Streaming, VOD
Broadcast (interlaced)
Film
Storage (DVD)
Higher block sizes
Review ABT options
Digital Cinema -- we have data
Look for synergies with low-complexity case
