Factoring and Eliminating Common Subexpressions in Polynomial Expressions



1
Factoring and Eliminating Common Subexpressions
in Polynomial Expressions
International Conference on Computer Aided Design (ICCAD), 2004
  • Farzan Fallah
  • Advanced CAD Research
  • Fujitsu Labs. of America
  • Anup Hosangadi, Ryan Kastner
  • ECE Department, UCSB
2
Outline
  • Introduction
  • Related Work
  • Algebraic techniques for redundancy elimination
  • Experimental results
  • Conclusions

3
Introduction
  • Embedded system applications need to compute
    polynomial expressions
  • Continuous functions can be approximated by
    polynomials to desired degree of accuracy
  • Adaptive signal processing (Polynomial filters )
  • Polynomial interpolation/extrapolation in
    Computer Graphics
  • Encryption

4
Introduction
  • Multiplications are expensive in embedded systems
  • No good optimization tool exists for reducing the complexity
    of polynomials
  • Designers rely on hand-optimized libraries
  • Conventional optimization techniques
  • CSE and value numbering are not suited to polynomials
  • Horner form is the most popular representation
  • $a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = (\dots((a_n x + a_{n-1})x + a_{n-2})x + \dots + a_1)x + a_0$
  • Not good for multivariate polynomials
  • Handles only a single polynomial expression at a time
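
The Horner form above can be sketched in a few lines. This is a minimal illustration, not code from the presentation; the function name `horner` and the sample polynomial are assumptions chosen for the example.

```python
def horner(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 using n multiplications.

    coeffs is ordered from the highest-degree coefficient a_n down to a_0.
    """
    result = coeffs[0]
    for a in coeffs[1:]:
        # One multiplication and one addition per remaining coefficient.
        result = result * x + a
    return result


# 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
print(horner([2, -6, 2, -1], 3))  # prints 5
```

The nesting is tied to a single variable, which is one way to see why the slide calls Horner form a poor fit for multivariate polynomials.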

5
Introduction
  • Quartic-spline polynomial (3-D graphics)
  • $P = zu^4 + 4avu^3 + 6bu^2v^2 + 4uv^3w + qv^4$
  • Horner form (from Maple™)
  • $P = zu^4 + (4au^3 + (6bu^2 + (4uw + qv)v)v)v$
  • (17 multiplications)
  • Proposed algebraic method
  • $d_1 = v^2$, $d_2 = 4v$
  • $P = u^3(uz + ad_2) + d_1(qd_1 + u(wd_2 + 6bu))$
  • (11 multiplications)
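
A quick numeric check that the three forms of the quartic-spline polynomial agree. This is a sketch under one stated assumption: the operators are missing from the transcript, so the code takes d1 = v^2 and d2 = 4v, which is the choice that makes the factored form expand back to P. The function names are hypothetical.

```python
import random


def p_expanded(u, v, w, z, a, b, q):
    return z*u**4 + 4*a*v*u**3 + 6*b*u**2*v**2 + 4*u*v**3*w + q*v**4


def p_horner(u, v, w, z, a, b, q):
    # Horner form in v, as on the slide.
    return z*u**4 + (4*a*u**3 + (6*b*u**2 + (4*u*w + q*v)*v)*v)*v


def p_factored(u, v, w, z, a, b, q):
    d1 = v * v      # d1 = v^2
    d2 = 4 * v      # assumption: d2 = 4v (operators are lost in the transcript)
    u3 = u * u * u
    return u3*(u*z + a*d2) + d1*(q*d1 + u*(w*d2 + 6*b*u))


# Integer inputs keep the comparison exact.
random.seed(0)
for _ in range(100):
    args = [random.randint(-9, 9) for _ in range(7)]
    assert p_expanded(*args) == p_horner(*args) == p_factored(*args)
```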

6
Related Work
  • Expression factorization (M. A. Breuer, JACM 1969)
  • Allows only one kind of operator at a time
  • Symbolic algebra techniques
  • (A. Peymandoust, De Micheli, DAC 2001)
  • Used for mapping DSP datapaths (polynomials) to
    library elements
  • Results depend upon an exponential library search
  • e.g. $a^2 - b^2 = (a+b)(a-b)$ iff $(a+b)$ or $(a-b)$
    is in the library
  • Manipulates only one expression at a time

F1 = A B C D, F2 = A P D
=> Extract (A D)
7
Motivating Example
  • Consider a set of expressions
  • Naïve implementation: 16 multiplications, 4
    additions/subtractions
  • Using CSE
  • 12 multiplications, 4 additions/subtractions

8
Motivating Example
  • Using our algebraic techniques
  • Total 7 multiplications, 3 additions/subtractions
  • Savings of 5 multiplications, 1
    addition/subtraction compared to CSE
  • Impossible to obtain such results using
    conventional techniques

9
Introduction to algebraic techniques for
redundancy elimination
  • Algebraic techniques in multi-level logic
    synthesis (MLLS)
  • Decomposition, factoring reduce number of
    literals
  • Distill and Condense use Rectangle Covering
    methods
  • Polynomial Expressions (Our Technique)
  • Factoring and single-term common subexpressions
    reduce the number of multiplications
  • Multiple-term common subexpressions reduce the
    number of additions and possibly multiplications
  • Key Differences (Generalization to handle higher
    orders)
  • Kernelling techniques
  • Finding single cube intersections

10
Introduction to our technique (Outline)
  • Find a subset of all possible subexpressions
    (kernel generation)
  • Transformation of Polynomial Expressions
  • Problem formulation
  • Extract multiple term common subexpressions and
    factors
  • Extract single term common factors

11
Introduction to our technique
  • Terminology
  • Literal: a variable or a constant, e.g. a, b, 2, 3.14
  • Cube: product of literals, e.g. $3a^2b$, $-2a^3b^2c$
  • SOP: sum of cubes, e.g. $3a^2b - 2a^3b^2c$
  • Cube-free expression: no literal or cube can
    divide all the cubes of the expression
  • Kernel: a cube-free sub-expression of an
    expression, e.g. $3 - 2abc$
  • Co-kernel: a cube that is used to divide an
    expression to get a kernel, e.g. $a^2b$
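
The terminology can be made concrete with a small sketch. It assumes the SOP example is 3a^2b - 2a^3b^2c and represents each cube as a (coefficient, exponent dictionary) pair. The helper names are hypothetical, and constants are left out of the common-cube computation for simplicity.

```python
def common_cube(sop):
    """Largest variable cube dividing every term: minimum exponent per variable."""
    variables = set().union(*(exps.keys() for _, exps in sop))
    cube = {v: min(exps.get(v, 0) for _, exps in sop) for v in variables}
    return {v: e for v, e in cube.items() if e > 0}


def divide(sop, cube):
    """Divide every term by the cube (assumes the cube divides each term)."""
    result = []
    for coeff, exps in sop:
        quotient = {v: e - cube.get(v, 0) for v, e in exps.items()}
        result.append((coeff, {v: e for v, e in quotient.items() if e > 0}))
    return result


def is_cube_free(sop):
    """True when no variable cube divides all terms."""
    return not common_cube(sop)


# 3a^2b - 2a^3b^2c
sop = [(3, {"a": 2, "b": 1}), (-2, {"a": 3, "b": 2, "c": 1})]
co_kernel = common_cube(sop)     # a^2 b
kernel = divide(sop, co_kernel)  # 3 - 2abc
print(co_kernel, kernel, is_cube_free(kernel))
```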

12
Introduction to our Technique
  • Matrix Representation of Polynomial Expressions
  • $F = x^3y + xy^2z$ is represented by a matrix
  • Each row represents a product term
  • Each column represents a variable/constant
  • Each element (i,j) represents the power of variable j
    in term i
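
The matrix itself is not preserved in the transcript. A minimal sketch of how it could be built for F = x^3y + xy^2z follows; `term_matrix` is a hypothetical helper, and constant columns are omitted because both coefficients are 1.

```python
def term_matrix(terms, variables):
    """One row per product term, one column per variable, entries are exponents."""
    return [[term.get(v, 0) for v in variables] for term in terms]


# F = x^3 y + x y^2 z, columns ordered (x, y, z)
F = [{"x": 3, "y": 1}, {"x": 1, "y": 2, "z": 1}]
matrix = term_matrix(F, ["x", "y", "z"])
print(matrix)  # [[3, 1, 0], [1, 2, 1]]
```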

13
Generation of Kernels (example)
  • $P_1 = x^3y + x^2y^2z$, $L = \{x, y, z\}$
  • Divide by x
  • $F_t = P_1/x = x^2y + xy^2z$

14
Generation of Kernels (example)
  • $F_t = P_1/x = x^2y + xy^2z$
  • C = biggest cube dividing all cubes of $F_t$
  • $C = xy$, i.e. the exponent row (1 1 0) over (x, y, z)
15
Generation of Kernels (example)
  • Obtain kernel
  • $F_1 = F_t/C = (x^2y + xy^2z)/(xy) = x + yz$
  • Obtain co-kernel
  • $D_1 = x \cdot (xy) = x^2y$
  • No kernels within $F_1$. Go back to $P_1$
  • $P_1 = x^3y + x^2y^2z$
  • Divide now by the next variable, y
  • $F_t = x^3 + x^2yz$
  • $C = x^2$
  • But C contains x, and x precedes y in the variable order ($x < y$)
  • Stop here, to avoid repeating the same kernel $F_t/C = x + yz$
  • No more kernels extracted
  • Record kernel $F_1 = P_1$ with co-kernel 1
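
The walk-through above can be sketched as a recursive procedure. This is an illustrative reading of the slides, not the authors' implementation. It assumes P1 = x^3y + x^2y^2z, ignores coefficients, stores each term as an exponent dictionary, and all function names are hypothetical.

```python
def common_cube(terms):
    """Largest cube dividing every term (minimum exponent of each shared variable)."""
    shared = set(terms[0]).intersection(*terms[1:])
    return {v: min(t[v] for t in terms) for v in shared}


def divide(terms, cube):
    """Divide each term by `cube`; assumes the cube divides every term."""
    out = []
    for t in terms:
        q = {v: e - cube.get(v, 0) for v, e in t.items()}
        out.append({v: e for v, e in q.items() if e > 0})
    return out


def times(c1, c2):
    """Product of two cubes: exponents add."""
    prod = dict(c1)
    for v, e in c2.items():
        prod[v] = prod.get(v, 0) + e
    return prod


def kernels(terms, order, start=0, co_kernel=None):
    """Return (co-kernel, kernel) pairs found by dividing by each variable in turn."""
    co_kernel = co_kernel or {}
    found = []
    for i in range(start, len(order)):
        v = order[i]
        with_v = [t for t in terms if v in t]
        if len(with_v) < 2:
            continue                     # v cannot yield a multi-term kernel
        ft = divide(with_v, {v: 1})      # Ft = F / v
        c = common_cube(ft)              # biggest cube dividing all cubes of Ft
        if any(order.index(u) < i for u in c):
            continue                     # same kernel was reached via an earlier variable
        f1 = divide(ft, c)               # kernel
        d1 = times(times(co_kernel, {v: 1}), c)  # co-kernel
        found.append((d1, f1))
        found.extend(kernels(f1, order, i, d1))
    return found


P1 = [{"x": 3, "y": 1}, {"x": 2, "y": 2, "z": 1}]
pairs = kernels(P1, ["x", "y", "z"])
pairs.append(({}, P1))  # the slide also records P1 itself with co-kernel 1
for co, k in pairs:
    print(co, k)
```

Dividing by y reaches C = x^2, which contains the earlier variable x, so that branch is skipped exactly as on the slide.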

16
Concept of kernels and co-kernels
  • Theorem: two expressions f and g can have a
    multiple-term common subexpression iff there are
    two kernels Kf and Kg having a multiple-term
    intersection
  • Detection of multiple-term common subexpressions
    by intersection of sets of kernels
  • Each co-kernel/kernel pair represents a
    possible factorization
  • e.g. $x^3y + x^2y^2z = x^2y(x + yz)$
  • The set of kernels is a subset of all possible
    subexpressions

17
All Kernels and Co Kernels
Which kernels to choose?
18
Kernel Cube Matrix (KCM)
  • One row for each kernel generated
  • One column for each distinct kernel cube
  • Each non-zero element represents a term, e.g. $x^3y$
19
Finding Kernel Intersections (Distill Algorithm)
  • Each kernel intersection or factor appears as a
    rectangle
  • Rectangle: set of rows and columns such that all
    elements are 1
  • Value of a rectangle: weighted sum of the number
    of operations saved
  • Goal: maximum-valued rectangular covering of the KCM
  • Greedy heuristic: covering by prime rectangles
  • Prime rectangle: rectangle not covered by any
    other rectangle
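
To make "prime rectangle" concrete, here is a brute-force enumeration over a small 0/1 matrix. The matrix `kcm` is a made-up example, not the KCM from the presentation, and `prime_rectangles` is a hypothetical helper. The search tries every column subset, so it is exponential in the number of columns and only suitable for illustration.

```python
from itertools import combinations


def prime_rectangles(matrix):
    """Enumerate maximal all-ones rectangles as (row indices, column indices)."""
    n_cols = len(matrix[0])
    primes = set()
    for k in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), k):
            # All rows that contain a 1 in every chosen column.
            rows = tuple(r for r, row in enumerate(matrix)
                         if all(row[c] for c in cols))
            if not rows:
                continue
            # Grow the column set to every column that is 1 in all those rows.
            closed = tuple(c for c in range(n_cols)
                           if all(matrix[r][c] for r in rows))
            primes.add((rows, closed))
    return sorted(primes)


# Hypothetical kernel-cube matrix: 3 kernels (rows), 4 kernel cubes (columns).
kcm = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
for rows, cols in prime_rectangles(kcm):
    print(rows, cols)
```

A greedy covering would score each of these rectangles with the value function and pick the best one first; the scoring step is omitted here because the formula is not preserved in the transcript.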

20
Finding Kernel Intersections (Distill Algorithm)
  • Formula for the value of a rectangle
  • R = number of rows
  • C = number of columns
  • M(Ri) = number of multiplications in row (co-kernel) i
  • M(Ci) = number of multiplications in column
    (kernel-cube) i
  • m = ratio of the weight of a multiplication to that of an
    addition
  • Value

Formula calculates savings in operation count
21
Distill Algorithm
22
Distill Algorithm
$4xy - x^2y = xy\,d_2$, $d_2 = 4 - x$: saves 2 multiplications
Remove covered terms
23
Distill Algorithm
  • Distill algorithm exits after no more kernel
    intersections can be found

$P_1 = x^2y\,d_1$, $d_1 = x + yz$
$P_2 = 4d_1 + xyz$, $d_2 = 4 - x$, $P_3 = xy\,d_2$
Can further optimize by finding single cube
intersections
24
Finding single cube intersections (Condense
Algorithm)
  • Need an algorithm for finding single term common
    subexpressions
  • Consider two single-term expressions
  • $F_1 = a^4b^3c$
  • $F_2 = a^2b^4c^2$
  • Form the Cube Variable Incidence Matrix (CIM)

One row for each product term
One column for each variable
25
Finding single cube intersections (Condense
algorithm)
  • Each (single-term) common subexpression appears
    as a rectangle
  • Rectangle: set of rows and columns where all
    elements are non-zero
  • Value of a rectangle is the number of multiplications
    saved by selecting it
  • C = cube corresponding to the rectangle
  • $\text{Value} = \text{Rows} \times \left(\left(\sum_i C_i\right) - 1\right)$
  • Maximum-valued rectangular covering will give the
    minimum number of multiplications
  • Use greedy iterative covering by prime rectangles
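
A single extraction step for the two example terms F1 = a^4b^3c and F2 = a^2b^4c^2 can be sketched as follows. Rather than relying on the value formula, the sketch counts multiplications directly before and after the extraction. It assumes naive counting (a cube with k literals costs k - 1 multiplications), and the helper names are hypothetical.

```python
def mults(cube):
    """Multiplications to compute a cube from scratch: (sum of exponents) - 1."""
    return max(sum(cube.values()) - 1, 0)


def common_cube(c1, c2):
    """Largest cube dividing both terms: elementwise minimum of exponents."""
    return {v: min(c1[v], c2[v]) for v in c1 if v in c2}


def quotient(cube, divisor):
    """cube / divisor, dropping variables whose exponent reaches zero."""
    q = {v: e - divisor.get(v, 0) for v, e in cube.items()}
    return {v: e for v, e in q.items() if e > 0}


F1 = {"a": 4, "b": 3, "c": 1}
F2 = {"a": 2, "b": 4, "c": 2}

d1 = common_cube(F1, F2)          # a^2 b^3 c
before = mults(F1) + mults(F2)    # each term computed separately

# After extraction each term is (quotient * d1). The quotient has k literals
# and d1 counts as one more, so the term costs k multiplications.
after = mults(d1) + sum(sum(quotient(F, d1).values()) for F in (F1, F2))

print(d1, before, after, before - after)
```

Repeating the step on the remaining cubes is what the Condense slides do next.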

26
Finding single cube intersections (Condense
algorithm)
Extract $d_1 = a^2b^3c$: $a^4b^3c = a^2d_1$, $a^2b^4c^2 = bc\,d_1$
Then extract $d_2 = bc$
27
Finding single cube intersections (Condense
algorithm)
$d_3 = a^2$
28
Finding single cube intersections (Condense
algorithm)
  • Final CIM
  • Final implementation (7 multiplications)

29
Distill Algorithm
  • Distill algorithm exits after no more kernel
    intersections can be found

$P_1 = x^2y\,d_1$, $d_1 = x + yz$
$P_2 = 4d_1 + xyz$, $d_2 = 4 - x$, $P_3 = xy\,d_2$
Can further optimize by finding single cube
intersections
30
Cube Literal Matrix (Condense Algorithm)
CIM for our example after Distill algorithm
Save 2 multiplications by extracting xy
31
Condense Algorithm
Extracting xy
No more favorable cube intersections found
32
Final Implementation
  • Total 7 multiplications, 3 additions/subtractions
  • Savings of 5 multiplications, 1
    addition/subtraction compared to CSE
  • Impossible to obtain such results using
    conventional techniques
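
The operation counts can be checked with a small instrumented number type. This sketch rests on two assumptions, because the original expressions are not preserved in the transcript: the expanded forms are inferred by expanding the factored forms from the Distill slides, and P3 is taken to be xy·d2. The class `Counted` and both functions are hypothetical.

```python
class Counted:
    """Wraps a number and counts multiplications and additions/subtractions."""
    mults = 0
    adds = 0

    def __init__(self, value):
        self.value = value

    @staticmethod
    def _val(other):
        return other.value if isinstance(other, Counted) else other

    def __mul__(self, other):
        Counted.mults += 1
        return Counted(self.value * self._val(other))
    __rmul__ = __mul__

    def __add__(self, other):
        Counted.adds += 1
        return Counted(self.value + self._val(other))
    __radd__ = __add__

    def __sub__(self, other):
        Counted.adds += 1
        return Counted(self.value - self._val(other))

    def __rsub__(self, other):
        Counted.adds += 1
        return Counted(self._val(other) - self.value)


def optimized(x, y, z):
    d1 = x + y * z          # multi-term kernel from Distill
    d2 = 4 - x              # second kernel from Distill
    d3 = x * y              # single cube extracted by Condense
    return x * d3 * d1, 4 * d1 + d3 * z, d3 * d2


def expanded(x, y, z):
    # Assumed original expressions, inferred from the factored forms.
    return (x**3*y + x**2*y**2*z, 4*x + 4*y*z + x*y*z, 4*x*y - x**2*y)


p1, p2, p3 = optimized(Counted(2), Counted(3), Counted(5))
assert (p1.value, p2.value, p3.value) == expanded(2, 3, 5)
print(Counted.mults, Counted.adds)  # prints 7 3
```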

33
Optimization of sin(x)
34
Optimization of sin(x)
35
Optimization of sin(x)
36
Optimization of sin(x)
  • Final Implementation
  • $X = x \cdot x$
  • $\sin(x) = x(1 + (-S_3 + (S_5 - S_7X)X)X)$
  • Total: 5 multiplications and 3 additions/subtractions
  • Same as the GNU C hand-optimized form
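
The final form can be sketched directly. The definitions of S3, S5 and S7 are on slides that did not survive extraction, so the code assumes they are the degree-7 Taylor coefficient magnitudes 1/3!, 1/5! and 1/7!; `sin_poly` is a hypothetical name.

```python
import math

# Assumed coefficients: magnitudes of the Taylor series terms of sin(x).
S3 = 1 / math.factorial(3)
S5 = 1 / math.factorial(5)
S7 = 1 / math.factorial(7)


def sin_poly(x):
    """Degree-7 approximation of sin(x): 5 multiplications, 3 additions/subtractions."""
    X = x * x                                        # 1 multiplication
    return x * (1 + (-S3 + (S5 - S7 * X) * X) * X)   # 4 more multiplications


# The approximation is close to math.sin for small |x|.
assert abs(sin_poly(0.5) - math.sin(0.5)) < 1e-6
```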

37
Experimental Setup (Sequential processor)
  • Signal processing and multimedia applications
  • MP3 decoder, Mesa (graphics), Adaptive filter,
    FFT, FIR
  • Taylor series approximation of trigonometric
    functions
  • Optimizations on arithmetic subgraphs from
    Dataflow graphs (DFGs)
  • Polynomials from computer graphics
  • Multivariate polynomial approximation
  • Compared number of operations with CSE and Horner
    form
  • Estimated savings in clock cycles on ARM core

38
Experimental results (comparing number of
operations from different methods)
Average run time: 0.45 s for our technique
39
Experimental results (Improvement over CSE and
Horner)
40
Conclusions
  • Development of new algebraic technique for
    optimizing polynomial expressions
  • Currently used for minimizing number of
    arithmetic operations using greedy rectangular
    covering
  • Results better than conventional techniques

41
Future Work
  • Develop and implement optimal algorithms to
    compare results with our greedy heuristic
  • Optimization for delay, energy
  • Impact of optimizations on stability

42
Thank You
  • Questions ??

43
  • Extra slides

44
Finding Kernel Intersections (Distill Algorithm)
  • Worst case scenario for Distill algorithm
  • Number of prime rectangles exponential in number
    of rows/columns
  • Heuristic methods to find best prime rectangle
  • In practice polynomial expressions are not so
    large