Transcript and Presenter's Notes

Title: Parallel Implementation of the Inversion of Polynomial Matrices


1
Parallel Implementation of the Inversion of
Polynomial Matrices
A thesis submitted in partial fulfillment of the
requirements for the degree of Master of Science
with a major in Computer Science.
Alina Solovyova-Vincent
March 26, 2003
2
Acknowledgments
  • I would like to thank Dr. Harris for his generous
    help and support.
  • I would like to thank my committee members, Dr.
    Kongmunvattana and Dr. Fadali, for their time and
    helpful comments.

3
Overview
  • Introduction
  • Existing algorithms
  • Buslowicz's algorithm
  • Parallel algorithm
  • Results
  • Conclusions and future work

4
Definitions
  • A polynomial matrix is a matrix which has
    polynomials in all of its entries.
  • H(s) = H_n s^n + H_(n-1) s^(n-1) + H_(n-2) s^(n-2) + ... + H_0,
  • where the H_i are constant r x r matrices,
  • i = 0, ..., n.
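For concreteness, one way to hold such a matrix in memory is as an array of its constant coefficient matrices. The sketch below is illustrative only; the struct and field names are assumptions, not taken from the thesis:

    /* Hypothetical storage for H(s) = H_n s^n + ... + H_1 s + H_0.      */
    /* coeff[i] points to the r x r constant matrix H_i, i = 0, ..., n.  */
    typedef struct {
        int n;            /* degree of the polynomial matrix             */
        int r;            /* dimension of each coefficient matrix        */
        double ***coeff;  /* coeff[i][row][col] holds (H_i)[row][col]    */
    } PolyMatrix;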

5
Definitions
  • Example:

             | s^2 + 2      s^3 + 3s^2 + s |
      H(s) = |                             |
             | s^3          s^2 + 1        |

  • n = 3: degree of the polynomial matrix
  • r = 2: the size of the matrix H

             | 2   0 |            | 0   1 |
      H_0 =  |       |     H_1 =  |       |
             | 0   1 |            | 0   0 |

6
Definitions
  • H^(-1)(s): the inverse of the matrix H(s)
  • One of the ways to calculate it:
  • H^(-1)(s) = adj H(s) / det H(s)
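For a 2 x 2 constant matrix this formula reduces to the familiar closed form: adj [[a b],[c d]] = [[d -b],[-c a]] and det = ad - bc. A minimal C sketch, illustrative only and not code from the thesis:

    /* Inverts a constant 2x2 matrix via its adjugate and determinant. */
    /* Returns -1 if the matrix is singular (det == 0).                */
    int invert2x2(const double h[2][2], double inv[2][2])
    {
        double det = h[0][0]*h[1][1] - h[0][1]*h[1][0];
        if (det == 0.0) return -1;
        inv[0][0] =  h[1][1] / det;
        inv[0][1] = -h[0][1] / det;
        inv[1][0] = -h[1][0] / det;
        inv[1][1] =  h[0][0] / det;
        return 0;
    }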

7
Definitions
  • A rational matrix can be expressed as a ratio of
    a numerator polynomial matrix and a denominator
    scalar polynomial.

8
Who Needs It???
  • Multivariable control systems
  • Analysis of power systems
  • Robust stability analysis
  • Design of linear decoupling controllers
  • and many more areas.

9
Existing Algorithms
  • Leverrier's algorithm (1840)
  • (sI - H)^(-1), the resolvent matrix
  • Exact algorithms
  • Approximation methods

10
The Selection of the Algorithm
  • Before Buslowicz's algorithm (1980), methods involved:
  • Large degree of polynomial operations
  • Lengthy calculations
  • Not very general
  • After: some improvements, at the cost of increased
    computational complexity

11
Buslowicz's Algorithm
  • Benefits:
  • More general than methods proposed earlier
  • Only requires operations on constant matrices
  • Suitable for computer programming
  • Drawback:
  • The irreducible form cannot be ensured in general

12
Details of the Algorithm
  • Available upon request

13
Challenges Encountered (sequential)
  • Several inconsistencies in the original paper

14
Challenges Encountered (parallel)
  • Dependent loops

for (i = 2; i < r+1; i++)
    for (k = 0; k < n*i+1; k++)
        calculations requiring R[i-1][k]

O(n^2 r^4)
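Because iteration i consumes the R values produced by iteration i-1, only the inner k loop can be distributed across processors. A rough sketch of that structure using OpenMP; the threading model, loop bounds, and helper name are assumptions, not the thesis code:

    /* Outer loop stays sequential: iteration i needs R[i-1][*].       */
    for (i = 2; i < r+1; i++) {
        /* Inner iterations are independent of one another, so they    */
        /* can be shared among threads.                                */
        #pragma omp parallel for
        for (k = 0; k < n*i+1; k++) {
            compute_R(R, i, k);   /* hypothetical helper using R[i-1][k] */
        }
    }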
15
Challenges Encountered (parallel)
  • Loops of variable length
for (k = 0; k < n*i+1; k++)
    for (ll = 0; ll < min+1; ll++)
        main calculations

The bound of the inner loop (min) varies with k.
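Since the inner bound changes with k, splitting the k iterations evenly across processors leaves some with more work than others. One common remedy in a shared memory setting is dynamic scheduling, sketched below; the schedule clause and helper names are assumptions, not necessarily what the thesis used:

    /* Work per k iteration varies, because the ll bound depends on k. */
    #pragma omp parallel for schedule(dynamic)
    for (k = 0; k < n*i+1; k++) {
        int len = inner_bound(k);      /* hypothetical: bound that varies with k */
        for (ll = 0; ll < len; ll++)
            main_calculations(k, ll);  /* hypothetical helper */
    }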
16
Shared and Distributed Memory
  • Main differences:
  • Synchronization of the processes
  • Shared memory: barrier
  • Distributed memory: data exchange

for (i = 2; i < r+1; i++)
    calculations requiring R[i-1]
    synchronization point after each iteration
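A rough sketch of the two synchronization styles around that loop; the MPI routine is real, but the surrounding structure and helper names are assumptions rather than the thesis code:

    /* Shared memory: threads meet at a barrier before starting i+1.   */
    #pragma omp parallel
    {
        for (int i = 2; i < r+1; i++) {
            compute_my_share_of_R(i);   /* hypothetical helper          */
            #pragma omp barrier         /* all of R[i] visible to all   */
        }
    }

    /* Distributed memory: each process sends its piece of R[i] to all */
    /* others before the next iteration begins.                        */
    for (int i = 2; i < r+1; i++) {
        compute_my_share_of_R(i);
        MPI_Allgather(my_part, count, MPI_DOUBLE,
                      R_i,     count, MPI_DOUBLE, MPI_COMM_WORLD);
    }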
17
Platforms
  • Distributed memory platforms:
  • SGI O2 NOW (MIPS R5000, 180 MHz)
  • P IV NOW (1.8 GHz)
  • P III Cluster (1 GHz)
  • P IV Cluster (Xeon, 2.2 GHz)

18
Platforms
  • Shared memory platforms:
  • SGI Power Challenge 10000
    (8 x MIPS R10000)
  • SGI Origin 2000
    (16 x MIPS R12000, 300 MHz)

19
Understanding the Results
  • n: degree of the polynomial (up to 25)
  • r: size of the matrix (up to 25)
  • Sequential algorithm: O(n^2 r^5)
  • Average of multiple runs
  • Unloaded platforms

20
Sequential Run Times (n = 25, r = 25)

Platform               Time (sec)
SGI O2 NOW               2645.30
P IV NOW                   22.94
P III Cluster              26.10
P IV Cluster               18.75
SGI Power Challenge       913.99
SGI Origin 2000           552.95
21
Results: Distributed Memory
  • Speedup
  • SGI O2 NOW - slowdown
  • P IV NOW - minimal speedup

22
Speedup (P III P IV Clusters)
23
Results: Shared Memory
  • Excellent results!!!

24
Speedup (SGI Power Challenge)
25
Speedup (SGI Origin 2000)
Superlinear speedup!
26
Run times (SGI Power Challenge)
8 processors
27
Run times (SGI Origin 2000)
n = 25
28
Run times (SGI Power Challenge)
r = 20
29
Efficiency (%)

Processors            2     4     6     8    16    24
P III Cluster        89.7  76.5  61.3  58.5  40.1  25.0
P IV Cluster         88.3  68.2  49.9  46.9  26.1  15.5
SGI Power Challenge  99.7  98.2  97.9  95.8   n/a   n/a
SGI Origin 2000      99.9  98.7  99.0  98.2  93.8   n/a
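For reference, the efficiency reported here is the usual parallel efficiency, E(p) = S(p) / p = T1 / (p * Tp), which the table appears to express as a percentage; for example, 93.8 on 16 processors of the SGI Origin 2000 corresponds to a speedup of about 0.938 x 16, roughly 15.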
30
Conclusions
  • We have surveyed the existing algorithms for
    inverting polynomial matrices.
  • We have implemented the sequential version of
    Buslowicz's algorithm.
  • We have implemented two versions of the parallel
    algorithm.
  • We have tested the parallel algorithm on 6
    different platforms.
  • We have obtained excellent speedup and efficiency
    in a shared memory environment.

31
Future Work
  • Study the behavior of the algorithm for larger
    problem sizes (distributed memory).
  • Re-evaluate message passing in the distributed
    memory implementation.
  • Extend Buslowicz's algorithm to inverting
    multivariable polynomial matrices
    H(s1, s2, ..., sk).

32
Questions