Title: Parallel Inversion of Polynomial Matrices
1Parallel Inversion of Polynomial Matrices
Alina Solovyova-Vincent, Frederick C. Harris, Jr., M. Sami Fadali
2Overview
- Introduction
- Existing algorithms
- Buslowicz's algorithm
- Parallel algorithm
- Results
- Conclusions and future work
3Definitions
- A polynomial matrix is a matrix which has polynomials in all of its entries.
- $H(s) = H_n s^n + H_{n-1} s^{n-1} + H_{n-2} s^{n-2} + \cdots + H_0$,
- where $H_i$ are constant $r \times r$ matrices,
- $i = 0, \ldots, n$.
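The definition above can be made concrete with a minimal sketch that stores a polynomial matrix as its list of constant coefficient matrices. The matrices `H0`, `H1`, `H2` and the function `evaluate` are hypothetical illustrations, not taken from the slides.

```python
# Hypothetical example with r = 2 and n = 2:
# H(s) = H2*s^2 + H1*s + H0 = [[s^2 + 1, s], [s, 2]].
# Coefficients are stored lowest degree first, so coeffs[i] is H_i.
H0 = [[1, 0], [0, 2]]
H1 = [[0, 1], [1, 0]]
H2 = [[1, 0], [0, 0]]
coeffs = [H0, H1, H2]


def evaluate(coeffs, s):
    """Evaluate H(s) = sum_i H_i * s**i using Horner's rule on matrices."""
    r = len(coeffs[0])
    result = [[0] * r for _ in range(r)]
    for Hi in reversed(coeffs):
        result = [[result[a][b] * s + Hi[a][b] for b in range(r)]
                  for a in range(r)]
    return result


n = len(coeffs) - 1      # degree of the polynomial matrix
r = len(coeffs[0])       # size of the matrix
print(n, r, evaluate(coeffs, 3))
```

Storing only the constant matrices $H_i$ is what lets later steps work on ordinary numeric matrices rather than symbolic entries.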
4Definitions
- Example
- $H(s) = \begin{bmatrix} s + 2 & s^3 + 3s^2 + s \\ s^3 & s^2 + 1 \end{bmatrix}$
- $n = 3$: degree of the polynomial matrix
- $r = 2$: the size of the matrix $H$
- $H_0$, $H_1$
5Definitions
- $H^{-1}(s)$: inverse of the matrix $H(s)$
- One of the ways to calculate it:
- $H^{-1}(s) = \operatorname{adj} H(s) / \det H(s)$
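As a small check of the adjugate/determinant formula, the sketch below works through a hypothetical 2 x 2 polynomial matrix, $H(s) = \begin{bmatrix} s & 1 \\ 0 & s + 1 \end{bmatrix}$, and verifies that $H(s)\,\operatorname{adj} H(s) = \det H(s)\, I$. Polynomials are plain coefficient lists (lowest degree first); all helper names are assumptions made for this illustration.

```python
def padd(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]


def pmul(p, q):
    """Multiply two polynomials by convolving their coefficients."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def pneg(p):
    return [-a for a in p]


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def adj_det_2x2(H):
    """Adjugate and determinant of a 2 x 2 polynomial matrix."""
    (a, b), (c, d) = H
    adj = [[d, pneg(b)], [pneg(c), a]]
    det = trim(padd(pmul(a, d), pneg(pmul(b, c))))
    return adj, det


def matmul_2x2(A, B):
    return [[trim(padd(pmul(A[i][0], B[0][j]), pmul(A[i][1], B[1][j])))
             for j in range(2)] for i in range(2)]


# Hypothetical H(s) = [[s, 1], [0, s + 1]]
H = [[[0, 1], [1]], [[0], [1, 1]]]
adj, det = adj_det_2x2(H)
product = matmul_2x2(H, adj)   # should equal det * I
print(det, product)
```

Here `det` is $s^2 + s$ and the product is diagonal with `det` on the diagonal, so dividing the adjugate by `det` gives the inverse as a rational matrix.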
6Definitions
- A rational matrix can be expressed as a ratio of
a numerator polynomial matrix and a denominator
scalar polynomial.
7Who Needs It???
- Multivariable control systems
- Analysis of power systems
- Robust stability analysis
- Design of linear decoupling controllers
- and many more areas.
8Existing Algorithms
- Leverrier's algorithm (1840)
- $(sI - H)^{-1}$: resolvent matrix
- Exact algorithms
- Approximation methods
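For the resolvent of a constant matrix, the Faddeev-LeVerrier form of Leverrier's method is a commonly cited recursion. The sketch below is an illustration of that recursion for a constant matrix `A` only; it is not the polynomial-matrix algorithm used in this work, and the function names are assumptions.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(A):
    """Return (c, M) for a constant square matrix A.

    c[0..n] are the characteristic polynomial coefficients, lowest degree
    first, with c[n] = 1. M[0..n-1] satisfy adj(sI - A) = sum_j M[j] * s**j,
    so (sI - A)^(-1) = adj(sI - A) / det(sI - A).
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [None] * n
    Mk = identity                      # coefficient of s**(n-1)
    for k in range(1, n + 1):
        M[n - k] = Mk
        AM = matmul(A, Mk)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k   # trace step
        Mk = [[AM[i][j] + c[n - k] * identity[i][j] for j in range(n)]
              for i in range(n)]
    return c, M


c, M = faddeev_leverrier([[1, 2], [3, 4]])
print(c, M)
```

For this `A`, the characteristic polynomial is $s^2 - 5s - 2$ and $\operatorname{adj}(sI - A) = sI + \begin{bmatrix} -4 & 2 \\ 3 & -1 \end{bmatrix}$.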
9The Selection of the Algorithm
- Before
- Large degree of polynomial operations
- Lengthy calculations
- Not very general
- Buslowicz's algorithm (1980)
- After
- Some improvements at the cost of increased
computational complexity
10Buslowicz's Algorithm
- Benefits
- More general than methods proposed earlier
- Only requires operations on constant matrices
- Suitable for computer programming
- Drawback
- the irreducible form cannot be ensured in general
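To illustrate what "operations on constant matrices" can look like, the sketch below multiplies two polynomial matrices by convolving their coefficient matrices. It is a generic building block under assumed names, not a reproduction of Buslowicz's algorithm. The hypothetical inputs are $P(s) = I + Ns$ and $Q(s) = I - Ns$ with $N^2 = 0$, so that $Q$ is the inverse of $P$.

```python
def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A))] for i in range(len(A))]


def mat_mul(A, B):
    r = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(r)]
            for i in range(r)]


def poly_mat_mul(P, Q):
    """Multiply polynomial matrices given as lists of coefficient matrices.

    P[i] and Q[j] are the constant matrices multiplying s**i and s**j;
    only constant-matrix products and sums are needed.
    """
    r = len(P[0])
    out = [[[0] * r for _ in range(r)] for _ in range(len(P) + len(Q) - 1)]
    for i, Pi in enumerate(P):
        for j, Qj in enumerate(Q):
            out[i + j] = mat_add(out[i + j], mat_mul(Pi, Qj))
    return out


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
N = [[0, 1], [0, 0]]
neg_N = [[0, -1], [0, 0]]

P = [I2, N]        # P(s) = I + N s
Q = [I2, neg_N]    # Q(s) = I - N s
product = poly_mat_mul(P, Q)
print(product)
```

The product has coefficient matrices `[I2, Z2, Z2]`, that is, the identity, which confirms that `Q` inverts `P` in this small case.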
11Details of the Algorithm
12Challenges Encountered (sequential)
- Several inconsistencies in the original paper
13Challenges Encountered (parallel)
for (i = 2; i < r + 1; i++) { calculations requiring $R_{i-1}(k)$ }
$O(n^2 r^4)$
14Challenges Encountered (parallel)
- Loops of variable length
    for (k = 0; k < n*i + 1; k++) {
        for (ll = 0; ll < min + 1; ll++) {    /* min varies with k */
            main calculations
        }
    }
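One way to see why variable-length loops complicate the parallel version is to count the inner iterations each worker would receive under two simple ways of dividing the k loop. This is a hypothetical sketch: the slide does not define `min`, so the code assumes it is `min(k, n)`, and the "block" and "cyclic" schemes are illustrative, not necessarily those used in this work.

```python
def inner_iterations(k, n):
    # ASSUMPTION: "min" on the slide is read as min(k, n); the source
    # does not define it.
    return min(k, n) + 1


def loads(n, i, workers, scheme):
    """Inner-loop iterations assigned to each worker for one outer step i."""
    total_k = n * i + 1                   # k runs over 0 .. n*i
    chunk = -(-total_k // workers)        # ceiling division, block scheme
    result = [0] * workers
    for k in range(total_k):
        owner = k // chunk if scheme == "block" else k % workers
        result[owner] += inner_iterations(k, n)
    return result


block = loads(4, 3, 4, "block")     # contiguous ranges of k per worker
cyclic = loads(4, 3, 4, "cyclic")   # k dealt out round-robin
print(block, max(block))
print(cyclic, max(cyclic))
```

With these assumed values the busiest worker does 20 inner iterations under the block split and 16 under the cyclic split, although the total work (55) is the same.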
15Shared and Distributed Memory
- Main differences
- Synchronization of the processes
- Shared Memory (barrier)
- Distributed memory (data exchange)
- for (i = 2; i < r + 1; i++) { calculations requiring $R_{i-1}$; synchronization point }
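The shared-memory pattern, where every step i needs the complete result of step i-1, can be sketched with Python's `threading.Barrier`. This is a stand-in for illustration only: the per-step computation is a placeholder, and the names `run_steps`, `worker`, and `width` are assumptions rather than anything from the original implementation.

```python
import threading


def run_steps(r, num_workers, width):
    """Run steps i = 2..r; each worker fills its share of R[i] from R[i-1]."""
    barrier = threading.Barrier(num_workers)
    # R[i] stands in for the slide's R_i; here it is just a list of numbers.
    R = {1: [1] * width}
    for i in range(2, r + 1):
        R[i] = [0] * width

    def worker(w):
        for i in range(2, r + 1):
            prev_total = sum(R[i - 1])            # needs all of R[i-1]
            for j in range(w, width, num_workers):
                R[i][j] = prev_total + j          # placeholder calculation
            barrier.wait()                        # synchronization point

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return R


result = run_steps(3, 2, 4)
print(result[3])
```

In a distributed-memory setting the barrier would be replaced by an exchange of the newly computed pieces of $R_i$, since no process can read the others' memory directly.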
16Platforms
- Distributed memory platforms
- SGI O2 NOW: MIPS R5000, 180 MHz
- P IV NOW: 1.8 GHz
- P III Cluster: 1 GHz
- P IV Cluster: Xeon, 2.2 GHz
17Platforms
- Shared memory platforms
- SGI Power Challenge 10000
- 8 MIPS R10000
- SGI Origin 2000
- 16 MIPS R12000, 300 MHz
18 Understanding the Results
- $n$: degree of polynomial ($\le 25$)
- $r$: size of a matrix ($\le 25$)
- Sequential algorithm: $O(n^2 r^5)$
- Average of multiple runs
- Unloaded platforms
19Sequential Run Times (n = 25, r = 25)
Platform               Time (s)
SGI O2 NOW             2645.30
P IV NOW                 22.94
P III Cluster            26.10
P IV Cluster             18.75
SGI Power Challenge     913.99
SGI Origin 2000         552.95
20Results Distributed Memory
- Speedup
- SGI O2 NOW - slowdown
- P IV NOW - minimal speedup
21Speedup (P III and P IV Clusters)
22Results Shared Memory
23Speedup (SGI Power Challenge)
24Speedup (SGI Origin 2000)
Superlinear speedup!
25Run times (SGI Power Challenge)
8 processors
26Run times (SGI Origin 2000)
n = 25
27Run times (SGI Power Challenge)
r = 20
28Efficiency
Efficiency (%) by number of processors
Platform               2      4      6      8      16     24
P III Cluster          89.7   76.5   61.3   58.5   40.1   25.0
P IV Cluster           88.3   68.2   49.9   46.9   26.1   15.5
SGI Power Challenge    99.7   98.2   97.9   95.8   n/a    n/a
SGI Origin 2000        99.9   98.7   99.0   98.2   93.8   n/a
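The efficiency figures can be converted back to speedups, assuming the usual definition (efficiency = speedup / number of processors) and that the table values are percentages. The helper name `speedup` is hypothetical; the numbers are copied from the table.

```python
def speedup(efficiency_percent, processors):
    """Speedup implied by an efficiency given in percent."""
    return efficiency_percent / 100.0 * processors


# Efficiency (%) keyed by number of processors, from the table above.
efficiency = {
    "P III Cluster": {2: 89.7, 4: 76.5, 6: 61.3, 8: 58.5, 16: 40.1, 24: 25.0},
    "P IV Cluster": {2: 88.3, 4: 68.2, 6: 49.9, 8: 46.9, 16: 26.1, 24: 15.5},
    "SGI Power Challenge": {2: 99.7, 4: 98.2, 6: 97.9, 8: 95.8},
    "SGI Origin 2000": {2: 99.9, 4: 98.7, 6: 99.0, 8: 98.2, 16: 93.8},
}

for platform, row in efficiency.items():
    implied = {p: round(speedup(e, p), 2) for p, e in row.items()}
    print(platform, implied)
```

Under that assumption, 25.0% efficiency on 24 processors of the P III Cluster corresponds to a speedup of about 6, while 93.8% on 16 processors of the SGI Origin 2000 corresponds to a speedup of about 15.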
29Conclusions
- We have performed an exhaustive search of all available algorithms.
- We have implemented the sequential version of Buslowicz's algorithm.
- We have implemented two versions of the parallel algorithm.
- We have tested the parallel algorithm on six different platforms.
- We have obtained excellent speedup and efficiency in a shared memory environment.
30Future Work
- Study the behavior of the algorithm for larger problem sizes (distributed memory).
- Re-evaluate message passing in the distributed memory implementation.
- Extend Buslowicz's algorithm to inverting multivariable polynomial matrices $H(s_1, s_2, \ldots, s_k)$.
31Questions