Title: A Dimension Abstraction Approach to Vectorization in Matlab
1A Dimension Abstraction Approach to Vectorization
in Matlab
- Neil Birkbeck
- Jonathan Levesque
- Jose Nelson Amaral
Computing ScienceUniversity of AlbertaEdmonton,
Alberta, Canada
2Problem
- Problem Statement
- Generate equivalent, error-free vectorized source
code for Matlab source while utilizing higher
level matrix operations when possible to improve
efficiency.
3Motivation
- Loop-based code is slower than vector code in
Matlab. - Why?
- interpretive overhead (type/shape checking,)
- resizing of arrays in loops
n1000 for i1n, A(i)B(i)C(i) end
5x faster!
n1000 A(1n)B(1n)C(1n)
- Vectorization also useful for compiled Matlab
code, where optimized vector routines could be
substituted.
4Related Work
- Data dependence vectorization
- Allen Kennedys Codegen algorithm
- Build data dependence graph
- Topological visit strongly connected components
- Abstract Matrix Form (AMF) Menon Pingali
- axioms used to transform array code
- take advantage of matrix multiplication
- Not clear if it is easily extensible or allows
for vectorization of irregular access (e.g.,
access to the diagonal)
5Incorrect Vectorization
Pull out of loop. Index variable substitution
(i?1n)
for i1n, a(i)b(i)c(i) end
a(1n)b(1n)c(1n)
- Vectorization correct if a,b, and c are row
vectors or column vectors
6Incorrect Vectorization
for i1n, x(i)y(i,h)z(h,i) end
- Matlab is untyped
- Vectorization depends on whether h is a vector or
scalar. - If h is a scalar
- Otherwise
x(1n)y(1n,h).z(h,1n)
x(1n)sum(y(1n,h).z(h,1n),2)
7Overview of Solution
Data dependence-based vectorizer
Vectorizable statement
Knowledge of Shape of variables
Propagate dimensionality up parse tree
Dimensions Agree?
Yes
Perform Transformations
No
Leave statement in loop
Output Vector statement
8More Specifically
Examples
- Represent dimensionality of expressions as list
of symbols 1 or (gt1) - Assume known for variables.
- Propagate up parse tree according to Matlab rules
- Compatibility
- dim(A)dim(B) when the lists are equivalent
(after removal of redundant 1s)
9Vectorized Dimensionality
- Vectorized dimensionality
- representation of dimensions after vectorization
of a loop - denoted dimi for loop with index variable i
- Introduce new symbol ri for index variable i
for i1n, a(i)10i end
10Vectorized Dimensionality
- Expressions with incompatible vectorized
dimensionality should not be vectorized. - When do dimensionalities agree?
- Assignment expressions elhserhs
- dimi(elhs)dimi(erhs) erhs(1)
- Element-wise binary operators eelhsTerhs
- dimi(elhs) (1)dimi(erhs)(1)dimi(elhs)dim(er
hs)
T in ,-,.,
Dimensionalities are compatible or erhs is scalar
Dimensionalities are compatible or either is
scalar
11Vectorized Dimensionality
- Rules very restrictive
- Assume dim(A)dim(B)dim(C)(,)
dimi,j(B)(rj,ri) dimi,j(C)(ri,rj) Vectorization
fails because (ri,rj) is not compatible with
(rj,ri)
for i1100, for j1100 A(i,j)B(j,i)C(i,j
) end end
12Transpose Transformation
- Extension to utilize transpose when necessary is
straightforward - For assignment
- if dimi(A)reverse(dimi(B)) then ABT is allowable
for i1m, for j1n A(i,j)B(j,i)
end end
dimi,j(A)reverse(dimi,j(B))(ri,rj) A(1m,1n)(B
(1n,1m))
13Transpose Transformation
- Extension to utilize transpose when necessary is
straightforward - Similar for pointwise operations
- if dimi(A)reverse(dimi(B)) then ATBT is
allowable, propagate dimi(ATBT)dimi(A) - if dimi(reverse(A))dimi(A) then ATTB is
allowable, propagate dimi(ATTB)dimi(B)
14Pattern Database
- Dimensionality disagreement at binary operators
inhibits vectorization. - Recognizing patterns (consisting of operator type
and operand dimensionalities) can be used to
identify a transformation enabling vectorization.
lhs operation rhs output (ri, rj)
T (ri,1) (ri, rj)
Pattern
for i1m, for j1n, A(i,j)B(i,j)C(i)
end end
Transformed Result
B(i,j)C(i)
B(1m,1n)repmat(C(1m),1,n)
15Pattern Database
lhs operation rhs output (ri, ri)
(index) nil (1, ri)
Pattern
for i1n, a(i)A(i,i)b(i) end
a(1n)A((1n)size(A,1)((1n)-1)).b(1n)
Column major indexing of A
16Additive Reduction Statements
- Additive-reduction statements use a loop variable
to perform an accumulation. - Not all loop nest index variables appear in
output dimensionality
17Additive Reduction (Solution)
- Maintain/propagate dimensionality and reduced
variables for an expression. - ?(E) denotes the reduced variables for
expression E - When checking statement A(J)A(J)E
- ensure dimi1,i2,,ikA(J)dimi1,i2,,ik(E) and
?(E)I-J - any variable ri in I-J but not in ?(E) must be
reduced
18Additive Reduction via Matrix Multiplication
- Matrix multiplication can be used to perform
reductions on eelhserhs , provided - dimi1,,ik(elhs)(Sl,rk)
- dimi1,,ik(erhs)(rk,Sr)
- rk is a reduction variable.
- Implies
- dimi1,,ik(e)(Sl,Sr)
- ?(e)union(?(elhs), ?(erhs),rk)
19Additive Reduction Example
- Additive reduction example
?(a(i,j)), dimi,j(a(i,j))(ri,rj) ?(b(j)),
dimi,j(b(j))(rj) rj is reduction variable
for i1m, for j1n, d(i)d(i)a(i,j)b(j)
c(i,j) end end
?(c(i,j)), dimi,j(c(i,j))ri,rj
Need to reduce rj c(i,j)?sum(c(i,j),2)
?(a(i,j)b(j)sum(c(i,j),2))rj,
dimi,j(a(i,j)b(j)sum(c(i,j),2)(ri,rj)
Dimensionality and reduced variables agree, now
replace index variables
d(1m)d(1m)a(1m,1n)b(1n)sum(c(1m,1n),2)
20Implementation Prototype
Vectorized Loop
Original Loop
Vectorizer
Embedded Control Statements
Code Generator
Octave Parser
yes
Dimension Check Success
no
no
Create DDG
Vectorize Statement
yes
- Pattern database and corresponding
transformations are specified in modular end-user
extensible manner.
21Results
- Source-to-source transformation
- Timing results averaged over 100 runs
- Platform
- Matlab 7.2.0.283
- 3.0 GHz Pentium D Processor
22Results
Input source
Vectorized Result
hhist(im(),0255)histogram heq255cumsum(h(
))/sum(h()) for i1size(im,1), for
j1size(im,2), im2(i,j)heq(im(i,j)1)
end end
hhist(im(),(0255)) heq255cumsum(h())/sum(
h()) im2(1size(im,1),1size(im,2))...
heq(im(1size(im,1),1size(im,2))1)
- For monochrome 8-bit 800x600 image
- original/vectorized
- Entire routine 0.178s/0.114s (speedup 1.56)
- Loop Portion only 0.0814s/0.0176s (speedup 4.6)
23Results (Menon Pingali Examples)
for k1p, for j1(i-1), X(i,k)X(i,k)-L(i,j)X(
j,k) end end
X(i,1p)X(i,1p)-L(i,1i-1)X(1i-1,1p)
for i1N,for j1N phi(k)phi(k)a(i,j)x_se(i)
f(j) end end
phi(k)phi(k)sum(a(1N,1N)
x_se(1N).f(1N),1)
for i1n,for j1n, for k1n,for l1n
y(i)y(i)x(j)A(i,k)
B(l,k)C(l,j) end end end end
y(1n)y(1n)x(1n)... (A(1n,1n)B(1n,1
n)C(1n,1n))
24Remaining Issues/Future Work
- Each pattern transformation is local no
optimization over entire statement. - e.g., we do not optimize and distribute
transposes - Control flow within loop
- Function calls
- functions are treated as pointwise operators
(correct for many predefined arithmetic
functions) - Incorporate our analysis directly with shape
analysis
25Summary
- Contributions
- A simple method to prevent incorrect
vectorization in Matlab - A user extensible operator/dimensionality pattern
database can be used to improve vectorization - These patterns can make use of higher level
semantics (e.g., matrix multiplication) or
diagonal accesses in vectorization.
26Acknowledgements
- Funding provided by NSERC
- Grateful for reviewers comments and suggestions
27Thank You