Title: A Parallel Path Tracker in PHCpack
1A Parallel Path Tracker in PHCpack
- Yusong Wang
- Department of Math, Stat & CS
- University of Illinois at Chicago
- E-mail ywang25_at_uic.edu
- SIAM PP04, San Francisco
- February 26, 2004
- Joint work with Jan Verschelde
2Introduction
- 1 Numerical Homotopy Algorithm
- If we wish to solve f(x) = 0, then we construct another polynomial system g(x) = 0 whose solutions are known.
- Consider the homotopy
- H(x,t) = (1 - t) g(x) + t f(x) = 0.
- By continuation, we trace the paths starting at the known solutions of g(x) = 0 to the desired solutions of f(x) = 0, for t from 0 to 1.
- T.Y. Li. Numerical solution of polynomial systems by homotopy continuation methods. Handbook of Numerical Analysis, Volume XI, pages 209--304, 2003.
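As a rough illustration of the continuation idea on this slide, the following Python sketch tracks the two solution paths of a small univariate example with an Euler predictor and a Newton corrector. The polynomials f and g, the step count, and the complex constant GAMMA are all assumptions chosen for illustration; none of this is PHCpack code. GAMMA = 1 gives the homotopy exactly as written on the slide; a non-real value is used here because, for this particular f and g, the plain homotopy has a double root at t = 2/3.

```python
# Illustrative sketch only: a fixed-step predictor-corrector tracker
# for one equation in one unknown. Not the PHCpack path tracker.

def f(x):  return x**2 - 3*x + 2      # target system, roots 1 and 2
def df(x): return 2*x - 3
def g(x):  return x**2 - 1            # start system, roots 1 and -1
def dg(x): return 2*x

GAMMA = complex(0.6, 0.8)             # assumed constant; GAMMA = 1 is the slide's formula

def H(x, t):   return GAMMA*(1 - t)*g(x) + t*f(x)
def H_x(x, t): return GAMMA*(1 - t)*dg(x) + t*df(x)   # derivative with respect to x
def H_t(x, t): return -GAMMA*g(x) + f(x)              # derivative with respect to t

def track(x, steps=200, newton_iters=3):
    """Follow one path of H(x,t) = 0 from t = 0 to t = 1."""
    for k in range(steps):
        t, t_next = k/steps, (k + 1)/steps
        # Euler predictor along dx/dt = -H_t / H_x
        x = x - (t_next - t)*H_t(x, t)/H_x(x, t)
        # Newton corrector at the new value of t
        for _ in range(newton_iters):
            x = x - H(x, t_next)/H_x(x, t_next)
    return x

start_roots = [complex(1.0), complex(-1.0)]
roots = [track(s) for s in start_roots]
for s, r in zip(start_roots, roots):
    print(s, "->", r, " residual:", abs(f(r)))
```

Each call to track is independent of the others, which is what makes the method easy to distribute over processors.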
3- 2 Why do we need a parallel path tracker?
A) For large problems, the computational workload increases dramatically.
- Example: It takes 8 hours to solve the cyclic 10-roots problem on a 1 GHz computer.
B) The homotopy algorithm is well suited for parallel computing, since each processor can follow its own paths independently.
A. Chakraborty, D.C.S. Allison, C.J. Ribbens, and
L.T. Watson. The parallel complexity of embedding
algorithms for the solution of systems of
nonlinear equations. IEEE Transactions on
Parallel and Distributed Systems. 4(4), 1993.
4Architecture of the Program
- Master reads f(x), g(x) and the start roots from a file.
- Master broadcasts f(x) and g(x) to all the slave nodes.
- Slaves construct H(x,t) = r(1 - t) g(x) + t f(x) = 0.
- Master distributes jobs (start roots) to the slave nodes, using either
  - static workload balance, or
  - dynamic workload balance.
- Slaves compute the target solutions with the path tracker in PHCpack*.
  * ACM Transactions on Mathematical Software, Algorithm 795.
- Slaves return the solutions to the master, which prints them out.
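The two ways of distributing start roots can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: threads and in-memory queues stand in for the processes and messages of the real program, and track_path is a hypothetical placeholder rather than the PHCpack tracker.

```python
import queue
import threading

def track_path(start_root):
    """Hypothetical stand-in for tracking one path; returns a fake target."""
    return start_root * 2

def static_split(start_roots, n_workers):
    """Static balance: the assignment of roots to workers is fixed up front."""
    return [start_roots[i::n_workers] for i in range(n_workers)]

def run_dynamic(start_roots, n_workers):
    """Dynamic balance: an idle worker fetches the next start root itself."""
    jobs, results = queue.Queue(), queue.Queue()
    for root in start_roots:
        jobs.put(root)

    def worker():
        while True:
            try:
                root = jobs.get_nowait()
            except queue.Empty:
                return                      # no jobs left, worker stops
            results.put((root, track_path(root)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    # The "master" collects one result per start root.
    return dict(results.get() for _ in range(len(start_roots)))

start_roots = list(range(10))
chunks = static_split(start_roots, 3)       # decided before any tracking starts
solutions = run_dynamic(start_roots, 3)     # decided while tracking runs
print(chunks)
print(sorted(solutions.items()))
```

In the static variant the master sends each worker its whole chunk once; in the dynamic variant there is one small exchange per path, which is the extra communication the dynamic scheme pays for.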
5Static and Dynamic Workload Balance
[Diagrams: Static Workload Balance; Dynamic Workload Balance]
6- Dynamic workload balance can be more efficient in some cases.
In the static case, the processors are not fully utilized: some processors finish their jobs early but must still wait for the others to finish.
In the dynamic case, the processors are much better utilized, and so the efficiency is higher.
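A small simulation makes the utilization argument concrete. The path costs below are made up for illustration (four expensive paths followed by eight cheap ones), and communication time is ignored; the sketch only compares when the last processor finishes under each scheme.

```python
import heapq

def static_makespan(costs, n_workers):
    """Each worker gets a fixed contiguous block of paths."""
    block = -(-len(costs) // n_workers)             # ceiling division
    loads = [sum(costs[i*block:(i + 1)*block]) for i in range(n_workers)]
    return max(loads)

def dynamic_makespan(costs, n_workers):
    """Each path goes to whichever worker becomes free first."""
    finish = [0] * n_workers                        # time at which each worker is free
    heapq.heapify(finish)
    for c in costs:
        t = heapq.heappop(finish)                   # earliest available worker
        heapq.heappush(finish, t + c)
    return max(finish)

def utilization(costs, n_workers, makespan):
    """Fraction of the total processor time spent on useful work."""
    return sum(costs) / (n_workers * makespan)

costs = [10, 10, 10, 10, 1, 1, 1, 1, 1, 1, 1, 1]    # assumed path times
n = 3
t_static = static_makespan(costs, n)
t_dynamic = dynamic_makespan(costs, n)
print("static :", t_static, utilization(costs, n, t_static))
print("dynamic:", t_dynamic, utilization(costs, n, t_dynamic))
```

With these assumed costs one worker in the static split receives all four expensive paths while the other two sit idle for most of the run; if all costs were equal, both schemes would finish at the same time.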
7Experimental Results
A) Cyclic 10-roots problem on the Platinum
Cluster at NCSA
8B) An application from Mechanism Design
The example comes from the geometric design of a five degree-of-freedom robot.
H.J. Su and J.M. McCarthy. Kinematic Synthesis of RPS Serial Chains. Proceedings of the ASME Design Engineering Technical Conferences, Paper DETC03/DAC-48813, Chicago, IL, Sept. 02-06, 2003.
The RPS serial chain
- Ten polynomials in ten unknowns
- 9,216 paths to track; more than 8,000 paths diverge
- It takes 24 hours with the sequential version of PHCpack on a 2.4 GHz Pentium machine.
- It takes only 22 minutes with the parallel path tracker on 128 1 GHz CPUs of the Platinum cluster at NCSA.
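The two timings quoted for the RPS problem can be put side by side with a few lines of arithmetic. The numbers come from the slide; the variable names are only for illustration. Because the sequential run used a 2.4 GHz machine and the parallel run used 1 GHz CPUs, the ratio is a wall-clock comparison, not a like-for-like speedup.

```python
sequential_minutes = 24 * 60    # 24 hours on one 2.4 GHz Pentium
parallel_minutes = 22           # 22 minutes on the cluster
n_cpus = 128                    # 1 GHz CPUs used in the parallel run

ratio = sequential_minutes / parallel_minutes
per_cpu = ratio / n_cpus        # wall-clock ratio divided by the CPU count

print(f"wall-clock ratio: {ratio:.1f}")
print(f"ratio per CPU:    {per_cpu:.2f}")
```

The wall-clock ratio is roughly 65, even though each cluster CPU runs at a lower clock rate than the sequential machine.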
9Comparison of the Static and Dynamic Workload
Balance with the RPS problem
10Why is the improvement not so obvious in this example?
- In this example, there is no large variance in computing times for the static model.
- The dynamic workload balance has more communication overhead, but this overhead is small compared with the computing time.
11Conclusions
- When the number of diverging paths is modest, there is a large variance in computing times among processors in the static model. In that case, the dynamic workload balancing approach is more advantageous.