Title: A Case for Source-Level Transformations in MATLAB
1. A Case for Source-Level Transformations in MATLAB
Vijay Menon and Keshav Pingali, Cornell University
The MaJic Project at Illinois/Cornell
- George Almasi
- Luiz De Rose
- David Padua
2. MATLAB
- High-Level Interpreted Language for Numerical Computing
- Matrix is a first-class type
- Library of numerical functions
- Application Domains
  - Image Processing
  - Structural Mechanics
  - Computational Finance
3. The Problem
- Development is fast...
  - 10X as concise as C/Fortran
- Performance is slow!
  - 10X as slow as C/Fortran
- Conventional Approach
  - Rewrite
  - Compile
4. Our Approach: Source-Level Optimization
- Apply high-level transformations directly to MATLAB code
- Significant performance benefit for
  - interpreted code
  - compiled code
5. Outline
- Overheads in MATLAB
- Conventional Compilation
- Source-Level Optimization
- Comparison
- Implementation Status
6. Outline
- Overheads in MATLAB
  - Type/Shape Checking
  - Memory Management
  - Array Bounds Checking
- Conventional Compilation
- Source-Level Optimization
- Comparison
- Implementation Status
7. Type/Shape Checking
- MATLAB has no type/shape declarations
- Consider
  - A * B
- Interpreter must check operand shape and type to perform the multiply (*)
- Shape
  - Scalar * Scalar
  - Scalar * Matrix
  - Matrix * Matrix
- Type
  - Real * Real
  - Real * Complex
  - Complex * Complex
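The shape and type cases listed for the multiply can be seen directly at the prompt. The sketch below is only illustrative: the variable names (`s`, `z`, `A`, `B`) are made up for this example, and it shows the observable results of each case, not how the interpreter implements its checks internally.

```matlab
% The same '*' token selects a different operation depending on operand
% shape and type, which the interpreter only learns at run time.
s = 2;                 % real scalar
z = 1 + 2i;            % complex scalar
A = [1 2; 3 4];        % real 2x2 matrix
B = [5 6; 7 8];        % real 2x2 matrix

ss = s * s;            % scalar * scalar: a single multiply
sA = s * A;            % scalar * matrix: scales every element
AB = A * B;            % matrix * matrix: linear-algebra product
sz = s * z;            % real * complex: result is complex
```

Since no declarations exist, each of these four statements requires the operands to be inspected before the right routine can be chosen.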
8. Type/Shape Checking
- Consider
  - for i = 1:n
  - y = y + a * x(i)
  - end
- Loops
  - perform redundant checks
  - magnify interpreter overhead
9. Memory Management: Dynamic Resizing
- Consider
  - x(10) = 10
- C/Fortran: x must have >= 10 elements
- MATLAB: x is resized if needed
  - Memory reallocated
  - Data copied
10. Memory Management: Dynamic Resizing
- MATLAB dynamically grows arrays
  - for i = 1:1000
  - x(i) = i
  - end
- Every iteration triggers resize!
  - 1,000 memory allocations
  - 500,000 elements copied
- Execution Time
  - x is undefined: 14.2 seconds
  - x is already defined: 0.37 seconds
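The allocation and copy counts quoted for the 1000-iteration loop can be reproduced with a short tally. This is a sketch under a simplifying assumption: each resize allocates exactly `i` elements and copies the `i-1` elements already present. A real interpreter may grow arrays differently; the counter names are hypothetical.

```matlab
% Tally the work done when x grows by one element per iteration.
% Assumed model: iteration i allocates a new block of i elements and
% copies the i-1 existing elements into it.
n = 1000;
allocations = 0;
copied = 0;
for i = 1:n
    allocations = allocations + 1;   % one reallocation per iteration
    copied = copied + (i - 1);       % old contents moved to the new block
end
% allocations is 1000 and copied is 499500, i.e. roughly 500,000
```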
11. Array Bounds Checking
- Consider array indexing
  - x(i) = y(i)
- Failed bounds check on
  - x(i) can trigger resize
  - y(i) can trigger error
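The asymmetry between the two sides of the assignment can be demonstrated with a small experiment. This is a minimal sketch with made-up data; the flag `read_failed` is introduced only for illustration.

```matlab
x = [1 2 3];
y = [4 5 6];

% Out-of-bounds WRITE: x is silently grown (and zero-filled) to 5 elements.
x(5) = y(2);

% Out-of-bounds READ: y has only 3 elements, so this raises an error.
read_failed = false;
try
    x(1) = y(7);
catch
    read_failed = true;
end
```

Both outcomes depend on run-time sizes, which is why the interpreter has to check every index on both sides.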
12. Array Bounds Checking
- In a loop
  - for i = 3:100
  - x(i) = x(i-1) + x(i-2)
  - end
- Interpreter performs redundant checks
- Compiler work
  - Nonresizable arrays: Gupta, PLDI '90
  - Resizable arrays: more difficult
13. Common Theme
- Loops magnify overheads
  - every iteration: redundant checks, resizes, ...
- MATLAB interprets naively
  - computes as is
  - no reorganization to optimize
14. Outline
- Overheads in MATLAB
- Conventional Compilation
  - Compile to C/Fortran
  - Rely on C/Fortran compiler for optimization
- Source-Level Optimization
- Comparison
- Implementation Status
15. MATLAB Compilers
- Compile to C/C++/Fortran
  - MCC -> C (The MathWorks)
  - MATCOM -> C++ (Mathtools)
  - FALCON -> F90 (U of Illinois)
- Native compiler generates executable code
  - Link back into MATLAB environment
  - Run as stand-alone program
16. The MCC Compiler
- Safe Optimization
  - Type Inference - no declarations in MATLAB
  - Eliminate Type Checks / Reduce Storage
  - Specialize for real input variables
  - Always legal!
- Unsafe Optimization
  - Assume all data is real
  - Eliminate all bounds checks - disallow resizing
  - User must ensure legality!
17. Falcon Benchmarks
- Collected by DeRose from MATLAB users at Illinois/NCSA
- Element/Loop Intensive
  - CN - Crank-Nicolson PDE Solver
  - Di - Dirichlet PDE Solver
  - FD - Finite Difference PDE Solver
  - Ga - Galerkin PDE Solver
  - IC - Incomplete Cholesky Factorization
- Memory Intensive
  - AQ - Adaptive Quadrature w/ Simpson's Rule
  - EC - Euler-Cromer two-body problem
  - RK - Runge-Kutta two-body problem
- Library Intensive
  - CG - Conjugate Gradients Iterative Solver
  - Mei - 3D Surface Generation
  - QMR - Quasi-Minimal Residual
  - SOR - Successive Over-Relaxation
18. MCC Safe Optimizations
19. MCC Unsafe Optimizations
Note: User must ensure legality!
20. Outline
- Overheads in MATLAB
- Conventional Compilation
- Source-Level Optimization
  - Vectorization
  - Preallocation
  - Expression Optimization
- Comparison
- Implementation Status
21. Vectorization
- Loops are expensive
  - Overheads are magnified
- Idea: Eliminate Loops
  - Map loops to higher-level matrix operations
- Interpreter uses efficient libraries
  - BLAS
  - LINPACK/EISPACK
22. Example of Vectorization
- In Galerkin, 98% of execution time is spent in
  - for i = 1:N
  - for j = 1:N
  - phi(k) = phi(k) + a(i,j)*x(i)*y(j)
  - end
  - end
23. Vectorized Code
- In Optimized Galerkin
  - phi(k) = phi(k) + x*a*y'
- Fragment Speedup: 260
- Program Speedup: 110
- Note: Not always possible!
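The equivalence behind this transformation can be checked on a small instance. The sketch below assumes `x` and `y` are row vectors and that the loop accumulates a double sum over `a(i,j)*x(i)*y(j)`. The test data (`magic(N)` and the two ramps) and the names `phi_loop` and `phi_vec` are illustrative and not taken from the Galerkin benchmark.

```matlab
N = 4;
a = magic(N);          % arbitrary N-by-N test matrix
x = 1:N;               % row vector
y = N:-1:1;            % row vector

% Loop form: N^2 interpreted iterations, each with its own checks.
phi_loop = 0;
for i = 1:N
    for j = 1:N
        phi_loop = phi_loop + a(i,j) * x(i) * y(j);
    end
end

% Vectorized form: a single expression evaluated by library routines.
phi_vec = x * a * y';
```

Both forms compute the same bilinear form. The vectorized one replaces the interpreted loop nest with one vector-matrix product and one dot product.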
24. Effect of Vectorization
25. Preallocation
- Eliminate Dynamic Resizing
  - Try to predict eventual size of array
  - Insert early allocation when possible
  - x = zeros(1000,1)
- Resizing will not be triggered
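A before/after pair makes the transformation concrete. This is a minimal sketch with hypothetical names (`x_grow`, `x_pre`) and an arbitrary loop body. Timings are left out because they vary by machine and MATLAB version.

```matlab
n = 1000;

% Before: x_grow starts empty, so each assignment extends the array.
x_grow = [];
for i = 1:n
    x_grow(i) = i;
end

% After: the final size is known up front, so allocate once and fill in place.
x_pre = zeros(1, n);
for i = 1:n
    x_pre(i) = i;
end
```

The two loops produce identical arrays; only the number of reallocations differs.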
26. Example of Preallocation
- In Euler-Cromer, 87% of time is spent in
  - for i = 1:N
  - r(i) = ...
  - th(i) = ...
  - t(i) = ...
  - k(i) = ...
  - p(i) = ...
  - ...
  - end
27. Preallocated Code
- In Optimized Euler-Cromer
  - r = zeros(1,N)
  - ...
  - for i = 1:N
  - r(i) = ...
  - ...
  - end
- Fragment Speedup: 7
- Program Speedup: 4
28. Effect of Preallocation
29. Expression Optimization
- MATLAB interprets expressions naïvely in left-to-right order
- Simple restructuring may significantly affect execution time, e.g.
  - A*B*x: O(n^3) flops
  - A*(B*x): O(n^2) flops
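The gap between the two orderings can be quantified with textbook operation counts. The sketch below assumes dense n-by-n matrices, an n-by-1 vector, and the classical cost of 2n-1 flops per length-n dot product. Actual library routines may count differently, and the names `flops_left` and `flops_right` are made up for this example.

```matlab
n = 50;
A = rand(n);
B = rand(n);
x = rand(n, 1);

left  = (A * B) * x;   % what left-to-right evaluation of A*B*x does
right = A * (B * x);   % reassociated: two matrix-vector products

% Same result up to rounding.
rel_err = norm(left - right) / norm(left);

% Classical flop counts under the assumptions stated above.
flops_left  = n^2 * (2*n - 1) + n * (2*n - 1);  % matrix-matrix, then matrix-vector
flops_right = 2 * n * (2*n - 1);                % two matrix-vector products
ratio = flops_left / flops_right;               % equals (n + 1) / 2
```

Under this model the saving grows linearly with n, which is the O(n^3) versus O(n^2) difference.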
30. Example of Expression Optimization
- In QMR, 70% of execution time is spent in
  - w = A'*q
- A: 420x420 matrix
- q, w: 420x1 vectors
- A' = transpose(A)
31. Expression Optimized Code
- In Optimized QMR: A'*q = (q'*A)'
  - w = (q'*A)'
- Transpose 2 vectors instead of 1 matrix
- Fragment Speedup: 20
- Program Speedup: 3
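The identity used here, that A'*q equals (q'*A)', can be confirmed numerically at the sizes quoted for QMR. This is a sketch on random data rather than the benchmark's actual matrix; `w_naive` and `w_opt` are illustrative names.

```matlab
n = 420;
A = rand(n);           % stand-in for the 420x420 QMR matrix
q = rand(n, 1);        % 420x1 vector

w_naive = A' * q;      % as written: transposes the full matrix
w_opt   = (q' * A)';   % rewritten: transposes two 420-element vectors

rel_diff = norm(w_naive - w_opt) / norm(w_naive);
```

The rewritten form touches n^2 matrix entries only inside the product, never to build an explicit transpose.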
32. Effect of Expression Optimization
33. Summary: Source-Level
34. Comparison
35. Point 1
- Source optimizations can outperform MCC
36. Point 2
- Source optimizations complement MCC
37. Benefits of Source-Level Optimizations
- Vectorization
  - Directly eliminates loop overhead
  - Moves work to hand-optimized BLAS
- Preallocation
  - Eliminates resizing overhead
  - Enables MCC array bounds elimination
- Expression Optimization
  - Uses algebraic info unavailable in C/Fortran
38. Implementation Status
- Illinois/Cornell MaJic system
  - Just-in-time MATLAB interpreter/compiler
- Incorporates Source-Level Transformation
  - Semantic Optimization (Menon/Pingali, ICS '99)
  - Vectorization/BLAS call generation
  - Expression Optimization
  - Preallocation/Bounds Check Optimization (work in progress)
39. Conclusion
- Source-level optimizations are important for enhancing the performance of MATLAB, whether code is just interpreted or later compiled
40. THE END
41. Unsafe Type Check Removal
42. Unsafe Bounds Check Removal