TECHNIQUES - PowerPoint PPT Presentation

1 / 50
About This Presentation
Title:

TECHNIQUES

Description:

techniques algorithms and software for the direct solution of large sparse linear equations – PowerPoint PPT presentation

Number of Views:125
Avg rating:3.0/5.0
Slides: 51
Provided by: Kathr121
Category:
Tags: techniques | algor

less

Transcript and Presenter's Notes

Title: TECHNIQUES


1
TECHNIQUES ALGORITHMS AND SOFTWARE FOR
THE DIRECT SOLUTION OF LARGE SPARSE LINEAR
EQUATIONS
2
Wish to solve Ax b A
is LARGE and S P A R S E
3
A LARGE . . . ORDER n(t)
t
n 1970 200 1975 1
000 1980 10 000 1985 100
000 1990 250 000 1995 500
000 2000 2 000 000 SPARSE . . . NUMBER
ENTRIES kn
k 2 - log n
4
NUMERICAL APPLICATIONS Stiff ODEs BDF Sparse
Jacobian Linear Programming Simplex Optimization
/Nonlinear Equations Elliptic Partial
Differential Equations Eigensystem Solution Two
Point Boundary Value Problems Least Squares
Calculations
5
APPLICATION AREAS Physics CFD
Lattice gauge Ato
mic spectra Chemistry Quantum
chemistry Chemical
engineering Civil Engineering Structural
analysis Electrical Engineering Power
systems Circuit
simulation Device
simulation Geography Geodesy Demograph
y Migration Economics Economic
modelling Behavioural sciences Industrial
relations Politics Trading Psycholo
gy Social dominance Business
administration Bureaucracy Operations
research Linear Programming
6
THERE FOLLOWS PICTURES OF SPARSE MATRICES FROM
VARIOUS APPLICATIONS This is done to illustrate
different structures for sparse matrices
7
(No Transcript)
8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
STANDARD SETS OF SPARSE MATRICES Original
set Harwell-Boeing Sparse Matrix
Collection Available by anonymous ftp
from ftp.numerical.rl.ac.uk or from the Web
page http//www.cse.clrc.ac.uk/Activity/SparseMat
rices Extended set of test matrices available
from Tim Davis http//www.cise.ufl.edu/research/
sparse/matrices
15
  • RUTHERFORD-BOEING SPARSE MATRIX TEST COLLECTION
  • Unassembled finite-element matrices
  • Least-squares problems
  • Unsymmetric eigenproblems
  • Large unsymmetric matrices
  • Very large symmetric matrices
  • Singular Systems
  • See Report
  • The Rutherford-Boeing Sparse Matrix Collection
  • Iain S Duff, Roger G Grimes and John G Lewis
  • Report RAL-TR-97-031, Rutherford Appleton
    Laboratory
  • http//www.numerical.rl.ac.uk/reports/reports
  • To be distributed via matrix market
  • http//math.nist.gov/MatrixMarket

16
RUTHERFORD - BOEING SPARSE MATRIX COLLECTION Ano
nymous FTP to FTP.CERFACS.FR cd
pub/algo/matrices/rutherford_boeing OR http//www.
cerfacs.fr/algor/Softs/index.html
17
A x b has solution x A-1 b
18
DIRECT METHODS Gaussian Elimination PAQ ?
LU Permutations P and Q chosen to preserve
sparsity and maintain stability L Lower
triangular (sparse) U Upper triangular
(sparse) SOLVE Ax b by Ly Pb then UQx y
19
  • PHASES OF SPARSE DIRECT SOLUTION
  • Although the exact subdivision of tasks for
    sparse direct solution will depend on the
    algorithm and software being used, a common
    subdivision is given by
  • ANALYSE An analysis phase where the matrix
  • is analysed to produce a suitable ordering
    and
  • data structures for efficient factorization.
  • FACTORIZE A factorization phase where the
  • numerical factorization is performed.
  • SOLVE A solve phase where the factors are used
    to
  • solve the system using forward and backward
  • substitution.
  • We note the following
  • ANALYSE is sometimes preceded by a
  • preordering phase to exploit structure.
  • For general unsymmetric systems, the ANALYSE
  • and FACTORIZE phases are sometimes combined
  • to ensure the ordering does not compromise
  • stability.

20
STEPS MATCH SOLUTION REQUIREMENTS 1. One-off
solution Ax b A/F/S 2.
Sequence of solutions Matrix changes but
structure is invarient A1x1 b1 A2x2 b2
A/F/S/F/S/F/S A3x3 b3 3. Sequence of
solutions Same matrix Ax1 b1 Ax2
b2 A/F/S/S/S Ax3 b3 For example . 2.
Solution of nonlinear system A is Jacobian 3.
Inverse iteration
21
TIMES (MA48) FOR TYPICAL EXAMPLE Seconds on
CRAY YMP matrix GRE 1107 ANALYSE/FACTORIZE .
66 FACTORIZE .075 SOLVE .0068
22
  • DIRECT METHODS
  • Easy to package
  • High accuracy
  • Method of choice in many applications
  • Not dramatically affected by
  • conditioning
  • Reasonably independent of structure
  • However
  • High time and storage requirement
  • Typically limited to n 1000000
  • So
  • Use on subproblem
  • Use as preconditioner

23
COMBINING DIRECT AND ITERATIVE METHODS can be
thought of as sophisticated preconditioning. Mult
igrid using direct method as coarse grid
solver. Domain Decomposition
using direct method on local
subdomains and direct
preconditioner on
interface. Block Iterative Methods
direct solver on sub-blocks Partial
Factorization as Preconditioner Full
Factorization of Nearby Problem as a
Preconditioner.
24
  • FOR EFFICIENT SOLUTION OF SPARSE EQUATIONS WE
    MUST
  • Only store nonzeros (or exceptionally a
  • few zeros also)
  • Only perform arithmetic with nonzeros
  • Preserve sparsity during computation

25
COMPLEXITY of LU factorization on dense matrices
of order n 2/3 n3 0(n2) flops n2 storage For
band (order n, bandwidth k) 2 k2n work, nk
storage Five-diagonal matrix (on k x k grid) 0
(k3) work and 0 (k2 log k) storage Tridiagonal
Arrowhead matrix 0 (n) work storage Target 0
(n) 0 (?) for sparse matrix of order n with ?
entries
26
(No Transcript)
27
GRAPH THEORY We form a graph on n vertices
associated with our matrix of order n. Edge (i,
j) exists in the graph if and only if entry aij
(and, by symmetry aji) in the matrix is
non-zero. Example x x x x x
x x x
x x x x x
x Symmetric Matrix and Associated Graph
28
MAIN BENEFITS OF USING GRAPH THEORY IN STUDY OF
SYMMETRIC GAUSSIAN ELIMINATION 1. Structure of
graph is invariant under symmetric
permutations of the matrix (corresponds to re-
labelling of vertices). 2. For mesh
problems, there is usually an equivalence
between the mesh and the graph associated
with the resulting matrix. We thus work
directly with the underlying structure. 3. We
can represent cliques in graphs by listing
vertices in a clique without storing all the
interconnecting edges.
29
(No Transcript)
30
(No Transcript)
31
COMPLEXITY FOR MODEL PROBLEM Model Problem with
n variables Ordering Time 0 (1) .. 0
(n) Symbolic 0 (n) Factorization Numerical 0
(n3/2) Solve 0 (n log2 n) and big 0 is not so
big
32
Ordering for sparsity unsymmetric matrices
x x x v v x v v x v v ri 3, cj
4 Minimize (ri - 1) (cj - 1) MARKOWITZ
ORDERING (Markowitz 1957) Choose nonzero
satisfying numerical criterion with best
Markowitz count
33
(No Transcript)
34
Benefits of Sparsity on Matrix of Order 2021 with
7353 entries
Total
Storage Flops Time (secs) Procedure
(Kwords) (106) CRAY
J90 Treating system as dense 4084
5503 34.5 Storing and operating
71 1073 3.4 only on
nonzero entries Using sparsity pivoting
14 42 0.9
35
An example of a code that uses Markowitz and
Threshold Pivoting is the HSL Code MA48
36
Comparison between sparse (MA48 from HSL) and
dense (SGESV from LAPACK). Times for
factorization/solution in seconds on CRAY Y-MP.
37
  • HARWELL SUBROUTINE LIBRARY
  • Started at Harwell in 1963
  • Most routines from research work of group
  • Particular strengths in
  • - Sparse Matrices
  • - Optimization
  • - Large-Scale system solution
  • Since 1990 Collaboration between RAL and
  • Harwell
  • HSL 2000 . . . October 2000
  • URL http//www.cse.clrc.ac.uk/Activity/HSL

38
  • Wide Range of Sparse Direct Software
  • Markowitz Threshold MA48
  • Band - Variable Band (Skyline) MA55
  • Frontal
  • - Symmetric MA62
  • - Unsymmetric MA42
  • Multifrontal
  • - Symmetric MA57 and MA67
  • - Unsymmetric MA41
  • - Very Unsymmetric MA38
  • - Unsymmetric element MA46
  • Supernodal SUPERLU

39
  • Features of MA48
  • Uses Markowitz with threshold privoting
  • Factorizes rectangular and singular systems
  • Block Triangularization
  • - Done at higher level
  • - Internal routines work only on single
  • block
  • Switch to full code at all phases of solution
  • Three factorize entries
  • - Only pivot order is input
  • - Fast factorize .. Uses structures
    generated
  • in first factorize
  • - Only factorizes later columns of matrix
  • Iterative refinement and error estimator in
  • Solve phase
  • Can employ drop tolerance
  • Low storage in analysis phase

40
(No Transcript)
41
(No Transcript)
42
(No Transcript)
43
TWO PROBLEMS WITH MARKOWITZ THRESHOLD 1.
Code Complexity ? Implementation
Efficiency 2. Exploitation of
Parallelism
44
TYPICAL INNERMOST LOOP DO 590 JJ J1,
J2 J ICN (JJ) IF (IQ(J). GT.0) GO TO
590 IOP IOP 1 PIVROW IJPOS - IQ
(J) A(JJ) A(JJ) AU x A (PIVROW) IF (LBIG)
BIG DMAXI (DABS(A(JJ)), BIG) IF (DABS (A(JJ)).
LT. TOL) IDROP IDROP 1 ICN (PIVROW)
ICN (PIVROW) 590 CONTINUE
45
  • INDIRECT ADDRESSING
  • Increases memory traffic
  • Results in bad data distribution and non-
  • localization of data
  • And non-regularity of access
  • Suffers from short vector lengths

46
SOLUTION Isolate indirect addressing W(IND(I))
from inner loops(s). Use full/dense matrix
kernels (Level 2 3 BLAS) in innermost loops.
47
MAIN PROBLEM WITH PARALLEL IMPLEMENTATION IS
THAT SPARSE DIRECT TECHNIQUES 1. REDUCE
GRANULARITY 2. INCREASE SYNCHRONIZATION 3.
INCREASE FRAGMENTATION OF DATA
48
PARALLEL IMPLEMENTATION X X
X X
X X
Obtain by choosing entries so that no two lie in
same row or column. Then do rank k update.
49
TECHNIQUES FOR USING LEVEL 3 DENSE BLAS IN
SPARSE FACTORIZATION Frontal methods (1971)
Multifrontal methods (1982) Supernodal
methods (1985)
All pre-dated BLAS in 1990
50
BLAST BLAS TECHNICAL FORUM Web
Page www.netlib.org/blas/blast-forum has
both Dense and Sparse BLAS
Write a Comment
User Comments (0)
About PowerShow.com