Title: Combined LNS Adder/Subtractors for DCT Hardware
2. Outline
- Logarithmic Number System (LNS)
- Discrete Cosine Transform (DCT)
- Combined LNS adder/subtractor
3. LNS (Logarithmic Number System)
- Represents a number by a sign bit and an exponent to a certain base b
[Figure: LNS word format: sign bit S followed by an (n-1)-bit exponent; F denotes the precision]
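As a rough illustration of this word format, the Python sketch below encodes a real value as a sign bit plus a base-b logarithm quantized to F fractional bits. The function names, the choice b = 2, and the treatment of zero are assumptions made for the example, not details taken from the slides.

```python
import math

def lns_encode(value, frac_bits, base=2.0):
    """Return (sign, exponent_code) for a nonzero real value.

    sign is 0 for positive and 1 for negative values. exponent_code is
    log_base(|value|) rounded to frac_bits fractional bits, kept as an integer.
    """
    if value == 0:
        # Assumption: zero needs a reserved code, which this sketch does not model.
        raise ValueError("zero has no logarithm")
    sign = 1 if value < 0 else 0
    exponent = math.log(abs(value), base)
    return sign, round(exponent * (1 << frac_bits))

def lns_decode(sign, exponent_code, frac_bits, base=2.0):
    """Convert (sign, exponent_code) back to a real value."""
    magnitude = base ** (exponent_code / (1 << frac_bits))
    return -magnitude if sign else magnitude
```

With F = 4, for instance, the value 8 is stored as sign 0 and exponent code 3 x 16 = 48.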
4. Properties of LNS
- Large dynamic range
- Multiplications, divisions and exponentiations are easy
- Additions are not linear operations in LNS
- Adder cost grows exponentially with word length
- Advantageous at low precisions
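5. LNS Arithmetic Units
- Multiplication
- $\log_b(XY) = \log_b X + \log_b Y$
- The cost is a fixed-point adder
- Addition
- More complex process than multiplication
- E.g., when calculating $\log_b(X+Y)$ from $x = \log_b X$, $y = \log_b Y$:
- 1. Calculate $z = x - y$ (corresponds to $Z = X/Y$)
- 2. Table-lookup $s_b(z) = \log_b(1 + b^z)$ (corresponds to $1 + X/Y$)
- 3. $\log_b(X+Y) = y + s_b(z)$ (corresponds to $Y(1 + X/Y) = X + Y$)
- Subtraction
- $d_b(z) = \log_b|1 - b^z|$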
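The three steps above can be sketched in Python as follows. This is a minimal model, assuming b = 2 and computing $s_b$ and $d_b$ directly with floating-point math where hardware would use lookup tables. The function names are chosen for the example.

```python
import math

B = 2.0  # base b; assumed to be 2 in this sketch

def s_b(z):
    """Addition function log_b(1 + b^z); a ROM lookup in hardware."""
    return math.log(1.0 + B ** z, B)

def d_b(z):
    """Subtraction function log_b|1 - b^z|; undefined at z = 0 (X = Y)."""
    return math.log(abs(1.0 - B ** z), B)

def lns_mul(x, y):
    """log_b(XY): a single fixed-point addition."""
    return x + y

def lns_add(x, y):
    """log_b(X + Y) for positive X and Y."""
    z = x - y           # step 1
    return y + s_b(z)   # steps 2 and 3

def lns_sub(x, y):
    """log_b|X - Y| for positive X and Y with X != Y."""
    z = x - y
    return y + d_b(z)
```

For X = 6 and Y = 2, `lns_add` returns approximately 3 (since 6 + 2 = 8 = 2^3) and `lns_sub` returns approximately 2 (since 6 - 2 = 4 = 2^2).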
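6. LNS Multiplication and Addition
$x = \log_b X$, $y = \log_b Y$
[Figure: datapaths for LNS multiplication and LNS addition. Multiplication: $\log_b(XY) = x + y$. Addition: $z = x - y$ addresses the $s_b(z)$ and $d_b(z)$ tables, and $\log_b(X+Y) = y + s_b(z)$ (or $y + d_b(z)$ when $S_x \neq S_y$).]
$s_b(z) = \log_b(1 + 2^z)$ (taking $b = 2$)
$d_b(z) = \log_b|1 - 2^z|$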
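The table selection by sign can be sketched as below. The choice between $s_b$ and $d_b$ follows the slide. The rule for the sign of the result (the sign of the operand with the larger magnitude) and the error raised on exact cancellation are assumptions added to make the example complete.

```python
import math

def s_b(z, b=2.0):
    return math.log(1.0 + b ** z, b)

def d_b(z, b=2.0):
    return math.log(abs(1.0 - b ** z), b)

def lns_signed_add(sx, x, sy, y):
    """Add two signed LNS values (sign, log magnitude); signs are 0 or 1."""
    z = x - y
    if sx == sy:
        # Same signs: magnitudes add, so use the s_b table.
        return sx, y + s_b(z)
    if z == 0:
        # Assumption: exact cancellation is reported as an error here.
        raise ValueError("result is zero, which has no logarithm")
    # Different signs: magnitudes subtract, so use the d_b table.
    sign = sx if z > 0 else sy  # the larger magnitude decides the sign
    return sign, y + d_b(z)
```

For example, adding +6 and -2 selects the $d_b$ table and yields sign 0 with log magnitude 2, that is, +4.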
7. Discrete Cosine Transform
- An important part of MPEG encoding
- The 2-D DCT is usually performed as two rounds of 1-D DCT to reduce the hardware cost
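The two-round decomposition can be illustrated with a short Python sketch: a 1-D DCT is applied to every row, then to every column of the result. The direct O(N^2) formula and the orthonormal DCT-II scaling used here are assumptions for the example. They are not the fast algorithm or the exact scaling used in the hardware.

```python
import math

def dct_1d(f):
    """Direct (not fast) orthonormal DCT-II of a sequence."""
    n = len(f)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(f[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def dct_2d(block):
    """2-D DCT as two rounds of 1-D DCTs: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols holds transformed columns; transpose back to row-major order.
    return [list(row) for row in zip(*cols)]
```

Because both rounds use the same 1-D unit, a single 1-D DCT datapath can serve the whole 2-D transform.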
8. LNS DCT in MPEG encoding
- Floating-point cost is too high for portable systems
- LNS gives the same visual result as fixed-point at the same precision
- LNS numbers have a shorter word length than fixed-point numbers
- At the same dynamic range and precision for MPEG-1:
- Fixed-point: (12 + F) bits
- LNS: (6 + F) bits
9. Fast DCT algorithm
- Chen's 1-D DCT algorithm (one cycle)
- Directly factorizes the DCT matrix
- 16 multiplications
- 26 additions
- Performs one 8-point 1-D DCT in one cycle
- Two-cycle version by reusing hardware
- 14 adders
- 10 multipliers
- Performs one 8-point 1-D DCT in two cycles
10. Diagram of Chen's 1-D DCT
[Figure: signal-flow graph of Chen's 8-point 1-D DCT. Inputs f(0) to f(7); outputs listed in the order F(0), F(4), F(2), F(6), F(1), F(5), F(3), F(7); branch gains are S(1/4), C(1/4), S(1/8), C(1/8), -C(1/8), S(1/16), C(1/16), S(5/16), C(5/16), -S(3/16), C(3/16), -S(7/16), C(7/16).]
$S(m/n) = \sin(m\pi/n)$, $C(m/n) = \cos(m\pi/n)$
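To make the gain notation concrete, the sketch below evaluates S(m/n) and C(m/n) and converts a gain to a sign plus a base-2 logarithm, which is the form an LNS datapath would store. The helper `lns_gain` and the base b = 2 are assumptions for illustration.

```python
import math

def S(m, n):
    """S(m/n) = sin(m*pi/n)."""
    return math.sin(m * math.pi / n)

def C(m, n):
    """C(m/n) = cos(m*pi/n)."""
    return math.cos(m * math.pi / n)

def lns_gain(value, base=2.0):
    """Return (sign, log_base|value|) for a nonzero constant gain."""
    sign = 1 if value < 0 else 0
    return sign, math.log(abs(value), base)

# A few of the gains that appear in the diagram.
gains = {
    "C(1/4)": lns_gain(C(1, 4)),
    "S(1/8)": lns_gain(S(1, 8)),
    "-C(1/8)": lns_gain(-C(1, 8)),
}
```

Since an LNS multiplication is a fixed-point addition, applying one of these gains amounts to adding its stored logarithm to the signal's exponent.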
11. Combined LNS adders/subtractors
- Many computational units in the DCT have the form below:
$X + Y$
$X - Y$
- The two computations above always access different tables: one uses the $s_b(z)$ table and the other the $d_b(z)$ table
- Share the table-lookup part and some combinational parts between the two computations
12. Combined LNS adder/subtractors
$x = \log_b X$, $y = \log_b Y$
- Same hardware
- Same address for different tables:
- 2. Table-lookup $s_b(z) = \log_b(1 + 2^z)$ (taking $b = 2$)
- 2. Table-lookup $d_b(z) = \log_b|1 - 2^z|$
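A minimal sketch of the sharing idea, assuming b = 2 and direct evaluation in place of ROM lookups: one subtraction produces $z$, the same $z$ addresses both tables, and the two results are routed to the sum and difference outputs according to the operand signs. The function name and the use of negative infinity for a zero result are assumptions of the example. Result signs are omitted for brevity.

```python
import math

def s_b(z, b=2.0):
    return math.log(1.0 + b ** z, b)

def d_b(z, b=2.0):
    if z == 0:
        return -math.inf  # assumption: a zero result is modelled as log = -inf
    return math.log(abs(1.0 - b ** z), b)

def combined_add_sub(sx, x, sy, y):
    """Return (log|X + Y|, log|X - Y|) from one shared z computation."""
    z = x - y          # shared subtractor
    s = s_b(z)         # same address z ...
    d = d_b(z)         # ... into two different tables
    if sx == sy:
        return y + s, y + d
    # Different signs: the roles of the two tables swap.
    return y + d, y + s
```

For X = 6 and Y = 2 the call returns approximately (3, 2), the logarithms of 8 and 4. With Y = -2 it returns approximately (2, 3).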
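13. Combined LNS adder/subtractors (type 1)
[Figure: type 1 combined adder/subtractor. A single subtractor forms $z = x - y$, which addresses both the $s_b(z)$ and $d_b(z)$ tables. Output $\log_b(X+Y) = y + s_b(z)$ (or $y + d_b(z)$ when $S_x \neq S_y$). Output $\log_b|X-Y| = y + d_b(z)$ (or $y + s_b(z)$ when $S_x \neq S_y$).]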
15. Diagram of Chen's 1-D DCT
[Figure: the signal-flow graph of Chen's 8-point 1-D DCT again, with the S(1/8), C(1/8), -C(1/8) and S(1/8) branches feeding F(2) and F(6) marked.]
$S(m/n) = \sin(m\pi/n)$, $C(m/n) = \cos(m\pi/n)$
16. Combined LNS adder/subtractors
- Some computation units perform the computations below ($a_1$, $a_2$ are constants), e.g. with gains S(1/8), C(1/8) and -C(1/8), S(1/8):
$a_1 X + a_2 Y$
$-a_2 X + a_1 Y$
- Access different tables in an LNS adder
- Share the table-lookup part
- Add some extra combinational hardware
- The table lookups of the two computations use different addresses
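The following sketch models such a unit under stated assumptions: b = 2, positive constants $a_1$ and $a_2$ supplied as logarithms, direct evaluation in place of the shared ROMs, and result signs omitted. The function name `rotate_pair` is hypothetical. It shows that the two outputs use different addresses ($z_1$ and $z_2$) and opposite tables.

```python
import math

def s_b(z, b=2.0):
    return math.log(1.0 + b ** z, b)

def d_b(z, b=2.0):
    return math.log(abs(1.0 - b ** z), b)  # undefined at z = 0

def rotate_pair(sx, x, sy, y, log_a1, log_a2):
    """Return (log|a1*X + a2*Y|, log|-a2*X + a1*Y|) for a1, a2 > 0."""
    p, q = log_a1 + x, log_a2 + y   # log|a1 X|, log|a2 Y|
    r, t = log_a2 + x, log_a1 + y   # log|a2 X|, log|a1 Y|
    z1 = p - q                      # address for the first output
    z2 = r - t                      # a different address for the second
    if sx == sy:
        # a1*X and a2*Y agree in sign; -a2*X and a1*Y do not.
        return q + s_b(z1), t + d_b(z2)
    return q + d_b(z1), t + s_b(z2)
```

With $a_1 = S(1/8)$, $a_2 = C(1/8)$, X = 3 and Y = 5, the two outputs are the logarithms of about 5.77 and 0.86.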
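17. Combined LNS adder/subtractors (type 2)
[Figure: type 2 combined adder/subtractor with shared $s_b(z)$ and $d_b(z)$ tables. Upper path: inputs $\log_b a_1 X$ and $\log_b a_2 Y$, difference $z_1$, output $\log_b(a_1 X + a_2 Y) = y + s_b(z_1)$ (or $y + d_b(z_1)$ when $S_x \neq S_y$). Lower path: inputs $\log_b a_2 X$ and $\log_b a_1 Y$, difference $z_2$, output $\log_b(-a_2 X + a_1 Y) = y + d_b(z_2)$ (or $y + s_b(z_2)$ when $S_x \neq S_y$).]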
18. Portions of table-lookup part in LNS adders
19. ROM size with/without combined LNS adder/subtractors
20. Hardware comparison for LNS adder and LNS adder/subtractors
21. LNS adder/subtractors in Chen's hardware
                              LNS adders   Ordinary   Type 1   Type 2
Direct inferred hardware          26           0        10        3
Two-cycle version hardware        14           4         3        2
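One way to read this table, which is an interpretation and not something the slide states, is that each combined unit (type 1 or type 2) produces two results and so stands in for two ordinary LNS adders. The short check below shows that the counts are consistent with that reading. The dictionary keys are names chosen for the example.

```python
# Counts copied from the table; the keys are hypothetical labels.
rows = {
    "direct": {"lns_adders": 26, "ordinary": 0, "type1": 10, "type2": 3},
    "two_cycle": {"lns_adders": 14, "ordinary": 4, "type1": 3, "type2": 2},
}

def additions_covered(row):
    """Additions handled if every combined unit replaces two adders."""
    return row["ordinary"] + 2 * row["type1"] + 2 * row["type2"]

def table_lookup_units(row):
    """Units that still need a table-lookup part under the same assumption."""
    return row["ordinary"] + row["type1"] + row["type2"]
```

Under this reading the direct hardware needs 13 table-lookup units in place of 26, and the two-cycle version needs 9 in place of 14.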
22. Hardware comparison for Chen's DCT algorithm at F = 4
23. Conclusion
- Significant area savings from combined LNS adder/subtractors in DCT hardware
- Suitable for reducing area in portable MPEG devices
- Some overhead when converting to/from fixed-point