Title: Parallel Implementation of the Inversion of Polynomial Matrices
1 Parallel Implementation of the Inversion of Polynomial Matrices
A thesis submitted in partial fulfillment of the
requirements for the degree of Master of Science
with a major in Computer Science.
Alina Solovyova-Vincent
March 26, 2003
2 Acknowledgments
- I would like to thank Dr. Harris for his generous help and support.
- I would like to thank my committee members, Dr. Kongmunvattana and Dr. Fadali, for their time and helpful comments.
3 Overview
- Introduction
- Existing algorithms
- Buslowicz's algorithm
- Parallel algorithm
- Results
- Conclusions and future work
4 Definitions
- A polynomial matrix is a matrix which has polynomials in all of its entries.
- $H(s) = H_n s^n + H_{n-1} s^{n-1} + H_{n-2} s^{n-2} + \dots + H_0$,
- where $H_i$ are constant $r \times r$ matrices,
- $i = 0, \dots, n$.
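The definition above suggests a natural data layout: store H(s) as the list of its constant coefficient matrices. The following is a minimal sketch under that assumption. The function name `evaluate` and the example matrix are hypothetical and are not taken from the thesis.

```python
from fractions import Fraction

def evaluate(coeffs, s):
    """Evaluate H(s) = H_n s^n + ... + H_0 by Horner's rule.

    coeffs[i] is the constant r x r matrix H_i, stored as a list of rows.
    """
    r = len(coeffs[0])
    result = [[Fraction(0)] * r for _ in range(r)]
    for H_i in reversed(coeffs):  # start with H_n, finish with H_0
        result = [[result[a][b] * s + H_i[a][b] for b in range(r)]
                  for a in range(r)]
    return result

# Hypothetical example with n = 2 and r = 2:
# H(s) = [[s^2 + 1, 2s], [3, s^2 - s]]
H0 = [[1, 0], [3, 0]]
H1 = [[0, 2], [0, -1]]
H2 = [[1, 0], [0, 1]]
H = [H0, H1, H2]

value_at_2 = evaluate(H, Fraction(2))
```

Storing only constant matrices means every later step can be written as operations on constant matrices rather than on symbolic polynomials.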
5 Definitions
- Example
- s2 s3 3s2s
- s3 s21
- n = 3: degree of the polynomial matrix
- r = 2: size of the matrix H
$H_0$, $H_1$
6 Definitions
- $H^{-1}(s)$: inverse of the matrix $H(s)$
- One of the ways to calculate it:
- $H^{-1}(s) = \operatorname{adj} H(s) / \det H(s)$
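To make the adjugate/determinant formula concrete, here is a small sketch for the 2 x 2 case only. It assumes polynomials are stored as coefficient lists in ascending powers. The helper names and the example matrix are hypothetical, and this is not the algorithm used in the thesis.

```python
def poly_add(p, q):
    """Add two polynomials given as ascending coefficient lists."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_neg(p):
    return [-a for a in p]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def adj_and_det_2x2(H):
    """Return (adj H(s), det H(s)) for a 2 x 2 polynomial matrix."""
    (a, b), (c, d) = H
    adj = [[d, poly_neg(b)], [poly_neg(c), a]]
    det = trim(poly_add(poly_mul(a, d), poly_neg(poly_mul(b, c))))
    return adj, det

# Hypothetical H(s) = [[s + 1, 1], [s, s + 2]]; [1, 1] means 1 + s.
H = [[[1, 1], [1]], [[0, 1], [2, 1]]]
adj, det = adj_and_det_2x2(H)
# H^{-1}(s) is then adj / det: a polynomial matrix over a scalar polynomial.
```

The result is a pair (numerator polynomial matrix, denominator scalar polynomial), which is the form in which the inverse is expressed.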
7 Definitions
- A rational matrix can be expressed as a ratio of a numerator polynomial matrix and a denominator scalar polynomial.
8 Who Needs It???
- Multivariable control systems
- Analysis of power systems
- Robust stability analysis
- Design of linear decoupling controllers
- and many more areas.
9 Existing Algorithms
- Leverrier's algorithm (1840)
- $sI - H$: resolvent matrix
- Exact algorithms
- Approximation methods
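For the Leverrier entry in the list above, the recursion is usually presented today in its Faddeev-LeVerrier form for a constant matrix A. The sketch below follows that textbook form and is not code from the thesis. The function name and the test matrix are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def faddeev_leverrier(A):
    """Return (M, c) such that
    adj(sI - A) = M[0] s^(n-1) + M[1] s^(n-2) + ... + M[n-1] and
    det(sI - A) = s^n + c[n-1] s^(n-1) + ... + c[0].
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    c = [Fraction(0)] * n + [Fraction(1)]          # c[n] = 1 (monic)
    M = []
    prev = [[Fraction(0)] * n for _ in range(n)]   # M_0 = 0
    for k in range(1, n + 1):
        AM = matmul(A, prev)
        # M_k = A M_{k-1} + c_{n-k+1} I
        cur = [[AM[i][j] + (c[n - k + 1] if i == j else 0)
                for j in range(n)] for i in range(n)]
        # c_{n-k} = -trace(A M_k) / k
        A_cur = matmul(A, cur)
        c[n - k] = -sum(A_cur[i][i] for i in range(n)) / k
        M.append(cur)
        prev = cur
    return M, c

M, c = faddeev_leverrier([[1, 2], [3, 4]])
```

The inverse of $sI - A$ is then the polynomial matrix built from `M` divided by the scalar polynomial with coefficients `c`. Exact `Fraction` arithmetic is used here only to keep the example free of rounding effects.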
10 The Selection of the Algorithm
- Before: large degree of polynomial operations, lengthy calculations, not very general
- Buslowicz's algorithm (1980)
- After: some improvements at the cost of increased computational complexity
11 Buslowicz's Algorithm
- Benefits
- More general than methods proposed earlier
- Only requires operations on constant matrices
- Suitable for computer programming
- Drawback
- The irreducible form cannot be ensured in general
12 Details of the Algorithm
13 Challenges Encountered (sequential)
- Several inconsistencies in the original paper
14 Challenges Encountered (parallel)
for (i = 2; i < r + 1; i++) {
    calculations requiring $R_{i-1,k}$
}
$O(n^2 r^4)$
15 Challenges Encountered (parallel)
- Loops of variable length
for (k = 0; k < n*i + 1; k++) {
    for (ll = 0; ll < min + 1; ll++) {
        main calculations
    }
}
- The inner loop length varies with k
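Variable-length loops matter because an even split of the outer index k does not give an even split of the work. The sketch below illustrates this with an assumed inner-loop length of min(k, n) + 1. The slide only says the bound varies with k, so that formula, the function names, and the two partitioning schemes are assumptions made for illustration.

```python
def inner_length(k, n):
    """Hypothetical inner-loop length: min(k, n) + 1 iterations."""
    return min(k, n) + 1

def work_per_worker(n, i, workers, cyclic):
    """Count the inner iterations each worker receives for k = 0..n*i."""
    total_k = n * i + 1
    loads = [0] * workers
    for k in range(total_k):
        if cyclic:
            owner = k % workers                  # round-robin over k
        else:
            owner = k * workers // total_k       # contiguous blocks of k
        loads[owner] += inner_length(k, n)
    return loads

block = work_per_worker(n=8, i=2, workers=4, cyclic=False)
cyclic = work_per_worker(n=8, i=2, workers=4, cyclic=True)
spread_block = max(block) - min(block)
spread_cyclic = max(cyclic) - min(cyclic)
```

Under these assumptions the block split leaves the worker with the smallest k values underloaded, while the round-robin split spreads short and long inner loops more evenly.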
16 Shared and Distributed Memory
- Main differences
- Synchronization of the processes
- Shared memory (barrier)
- Distributed memory (data exchange)
for (i = 2; i < r + 1; i++) {
    calculations requiring $R_{i-1}$
    synchronization point
}
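The synchronization pattern above can be sketched with a barrier in shared memory. The example uses a toy recurrence, repeated polynomial multiplication, as a stand-in for the calculations that need the previous stage. It is not Buslowicz's algorithm, and the function name and cyclic work split are assumptions. CPython threads are used only to show the structure, not to obtain real speedup.

```python
import threading

def poly_power(h, power, workers=4):
    """Coefficients of h(s)**power, computed one stage at a time.

    Stage i needs every coefficient of stage i-1, so all workers meet
    at a barrier before any of them starts the next stage.
    """
    n = len(h) - 1
    size = n * power + 1
    # Stage i is stored in buffers[i % 2]; stage 1 is h itself, zero padded.
    buffers = [[0] * size, h + [0] * (size - len(h))]
    barrier = threading.Barrier(workers)

    def worker(rank):
        for i in range(2, power + 1):
            src, dst = buffers[(i - 1) % 2], buffers[i % 2]
            for k in range(rank, n * i + 1, workers):  # this worker's share of k
                dst[k] = sum(h[l] * src[k - l] for l in range(min(k, n) + 1))
            barrier.wait()  # synchronization point: stage i is complete

    threads = [threading.Thread(target=worker, args=(rank,))
               for rank in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffers[power % 2]

result = poly_power([1, 1], 4)  # (1 + s)^4
```

In a distributed memory version the barrier would be replaced by an exchange of the newly computed coefficients, since each process would hold only its own part of the previous stage.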
17 Platforms
- Distributed memory platforms
- SGI O2 NOW: MIPS R5000, 180 MHz
- P IV NOW: 1.8 GHz
- P III Cluster: 1 GHz
- P IV Cluster: Xeon, 2.2 GHz
18 Platforms
- Shared memory platforms
- SGI Power Challenge 10000
- 8 MIPS R10000 processors
- SGI Origin 2000
- 16 MIPS R12000 processors, 300 MHz
19 Understanding the Results
- n: degree of the polynomial (≤ 25)
- r: size of the matrix (≤ 25)
- Sequential algorithm: $O(n^2 r^5)$
- Average of multiple runs
- Unloaded platforms
20 Sequential Run Times (n = 25, r = 25)
Platform               Time (sec)
SGI O2 NOW             2645.30
P IV NOW               22.94
P III Cluster          26.10
P IV Cluster           18.75
SGI Power Challenge    913.99
SGI Origin 2000        552.95
21 Results: Distributed Memory
- Speedup
- SGI O2 NOW: slowdown
- P IV NOW: minimal speedup
22 Speedup (P III and P IV Clusters)
23 Results: Shared Memory
24 Speedup (SGI Power Challenge)
25 Speedup (SGI Origin 2000)
Superlinear speedup!
26 Run times (SGI Power Challenge)
8 processors
27 Run times (SGI Origin 2000)
n = 25
28 Run times (SGI Power Challenge)
r = 20
29 Efficiency (%)
Processors             2     4     6     8     16    24
P III Cluster          89.7  76.5  61.3  58.5  40.1  25.0
P IV Cluster           88.3  68.2  49.9  46.9  26.1  15.5
SGI Power Challenge    99.7  98.2  97.9  95.8  n/a   n/a
SGI Origin 2000        99.9  98.7  99.0  98.2  93.8  n/a
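The efficiency figures can be converted back to speedup using the standard definition, efficiency = speedup / number of processors. The sketch below assumes the table values are percentages and that the column headers are processor counts. Neither is stated explicitly on the slide.

```python
def speedup(efficiency_percent, processors):
    """Speedup implied by efficiency = speedup / processors."""
    return efficiency_percent / 100.0 * processors

processors = [2, 4, 6, 8, 16, 24]
efficiency = {
    "P III Cluster": [89.7, 76.5, 61.3, 58.5, 40.1, 25.0],
    "P IV Cluster": [88.3, 68.2, 49.9, 46.9, 26.1, 15.5],
    "SGI Power Challenge": [99.7, 98.2, 97.9, 95.8, None, None],
    "SGI Origin 2000": [99.9, 98.7, 99.0, 98.2, 93.8, None],
}

# None marks the n/a entries of the table.
implied = {
    platform: [None if e is None else round(speedup(e, p), 2)
               for e, p in zip(values, processors)]
    for platform, values in efficiency.items()
}
```

Read this way, 25.0% efficiency on 24 processors of the P III Cluster corresponds to a speedup of about 6, while 93.8% on 16 processors of the SGI Origin 2000 corresponds to a speedup of about 15.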
30 Conclusions
- We have performed an exhaustive search of all available algorithms.
- We have implemented the sequential version of Buslowicz's algorithm.
- We have implemented two versions of the parallel algorithm.
- We have tested the parallel algorithm on six different platforms.
- We have obtained excellent speedup and efficiency in a shared memory environment.
31 Future Work
- Study the behavior of the algorithm for larger problem sizes (distributed memory).
- Re-evaluate message passing in the distributed memory implementation.
- Extend Buslowicz's algorithm to inverting multivariable polynomial matrices $H(s_1, s_2, \dots, s_k)$.
32 Questions