Graph Algorithms in Numerical Linear Algebra: Past, Present, and Future presentation

About This Presentation

Transcript and Presenter's Notes

Title: Graph Algorithms in Numerical Linear Algebra: Past, Present, and Future

1
Graph Algorithms in Numerical Linear Algebra
Past, Present, and Future

John R. Gilbert
MIT and UC Santa Barbara
September 28, 2002

2
Outline

Past (one extended example)
Present (two quick examples)
Future (6 or 8 examples)

3
A few themes

Paths
Locality
Eigenvectors
Huge data sets
Multiple scales

4
PAST
5
Graphs and sparse Gaussian elimination (1961-)
Fill new nonzeros in factor
Cholesky factorization for j 1 to n add
edges between js higher-numbered neighbors
G(A)chordal
G(A)
6
Fill-reducing matrix permutations

Theory approx optimal separators gt approx
optimal fill and flop count
Orderings nested dissection, minimum degree,
hybrids
Graph partitioning spectral, geometric,
multilevel

7
Directed graph
A
G(A)

A is square, unsymmetric, nonzero diagonal
Edges from rows to columns
Symmetric permutations PAPT

8
Symbolic Gaussian elimination
A
G (A)

Add fill edge a -gt b if there is a path from a to
b through lower-numbered vertices.

9
Strongly connected components
G(A)
PAPT

Symmetric permutation to block triangular form
Solve Axb by block back substitution
Irreducible (strong Hall) diagonal blocks
Row and column partitions are independent of
choice of nonzero diagonal
Find P in linear time by depth-first search

10
Sparse-sparse triangular solve
G(LT)
L
x
b

Symbolic
Predict structure of x by depth-first search from
nonzeros of b
Numeric
Compute values of x in topological order

Time O(flops)
11
Column intersection graph
A
G?(A)
ATA

G?(A) G(ATA) if no cancellation (otherwise
?)
Permuting the rows of A does not change G?(A)

12
Filled column intersection graph
A
chol(ATA)

G?(A) symbolic Cholesky factor of ATA
In PALU, G(U) ? G?(A) and G(L) ? G?(A)
Tighter bound on L from symbolic QR
Bounds are best possible if A is strong Hall

13
Column elimination tree
T?(A)
A
chol(ATA)

Elimination tree of ATA (if no cancellation)
Depth-first spanning tree of G?(A)
Represents column dependencies in various
factorizations

14
That story continues . . .

Matlab 4.0, 1992
Left-looking column-by-column LU factorization
Depth-first search to predict structure of each
column
Slow search limited speed
BLAS-1 limited cache reuse
. . .
SuperLU supernodal BLAS-2.5 LU
UMFPACK multifrontal BLAS-3 LU Matlab 6.5,
2002
Ordering for nonsymmetric LU is still not well
understood

15
PRESENT
16
Support graph preconditioning
http//www.cs.sandia.gov/bahendr/support.html

Define a preconditioner B for matrix A
Explicitly compute the factorization B LU
Choose nonzero structure of B to make factoring
cheap (using combinatorial tools from direct
methods)
Prove bounds on condition number using both
algebraic and combinatorial tools

17
Support graph preconditioner example
Vaidya
G(A)
G(B)

A is symmetric positive definite with negative
off-diagonal nzs
B is a maximum-weight spanning tree for A (with
diagonal modified to preserve row sums)
Preconditioning costs O(n) time per iteration
Eigenvalue bounds from graph embedding product
of congestion and dilation
Condition number at most n2 independent of
coefficients
Many improvements exist

18
Support graph preconditioner example
Vaidya
G(A)
G(B)

can improve congestion and dilation by adding a
few strategically chosen edges to B
cost of factorsolve is O(n1.75), or O(n1.2) if A
is planar
in experiments by Chen Toledo, often better
than drop-tolerance MIC for 2D problems, but not
for 3D.

19
Algebraic framework
Gremban/Miller/Boman/Hendrickson

The support of B for A is s(A, B) mint
xT(tB A)x ? 0 for all x, all t ? t
In the SPD case, s(A, B) max? Ax ?Bx
?max(A, B)
Theorem 1 If A, B are SPD then ?(B-1A)
s(A, B) s(B, A)
Theorem 2 If VWU, then s(UUT, VVT)
? W22

20
Algebraic framework
Gremban/Miller/Boman/Hendrickson

Lemma If VWU, then s(UUT, VVT) ?
W22
Proof
take t ? W22 ?max(WWT)
max x?0 xTWWTx / xTx
then xT (tI - WWT) x ? 0 for all x
letting x VTy gives yT (tVVT - UUT) y
? 0 for all y
recall s(A, B) mint xT(tB A)x ? 0 for
all x, all t ? t
thus s(UUT, VVT) ? t

21
s(A, B) ? W22 ? W? x W1
(max row sum) x (max col sum) ? (max
congestion) x (max dilation)
22
Open problems I

Other subgraph constructions for better bounds on
W22 ?
For example Boman,
W22 ? WF2 sum(wij2) sum of
(weighted) dilations,
and Alon, Karp, Peleg, West show there exists a
spanning tree with average weighted dilation
exp(O((log n loglog n)1/2)) o(n? )
this gives condition number O(n1?) and solution
time O(n1.5?),
compared to Vaidya O(n1.75) with augmented
spanning tree
Is there a construction that minimizes W22
directly?

23
Open problems II

Make spanning tree methods more effective in 3D?
Vaidya O(n1.75) in general, O(n1.2) in 2D
Issue 2D uses bounded excluded minors, not just
separators
Analyze a multilevel method in general?
Extend to more general finite element matrices?

24
Link analysis of the world-wide web

Web page vertex
Link directed edge
Link matrix Aij 1 if page i links to page j

25
Web graph PageRank (Google) Brin,
Page
An important page is one that many important
pages point to.

Markov process follow a random link most of the
time otherwise, go to any page at random.
Importance stationary distribution of Markov
process.
Transition matrix is pA (1-p)ones(size(A)),
scaled so each column sums to 1.
Importance of page i is the i-th entry in the
principal eigenvector of the transition matrix.
But, the matrix is 2,000,000,000 by 2,000,000,000.

26
Web graph Hubs and authorities
Kleinberg
Authorities trekbikes.com shimano.com campagnol
o.com
Hubs bikereviews.com phillybikeclub.org yahoo.c
om/cycling
A good hub cites good authorities A good
authority is cited by good hubs

Each page has hub score xi and authority score yi
Repeat y ? Ax x ? ATy normalize
Converges to principal eigenvectors of AAT and
ATA(left and right singular vectors of A)

27
FUTURE
28
Combinatorial issues in numerical linear algebra

Approximation theory for nonsymmetric LU ordering
Preconditioning
Complexity of matrix multiplication

29
Biology as an information science

The rate-limiting step of genomics is
computational science. - Eric
Lander
Sequence assembly, gene identification, alignment
Large-scale data mining
Protein modeling discrete and continuous

30
Linear and sublinear algorithms for huge problems

Can we understand anything interesting about
our data when we do not even have time to read
all of it? - Ronitt
Rubinfeld

31
Fast Monte Carlo algorithms for finding low-rank
approximations Frieze, Kannan,
Vempala

Describe a rank-k matrix B0 that is within e of
the best rank-k approximation to A
A B0F ? minB A - BF e AF
Correct with probability at least 1 d
Time polynomial in k, 1/e, log(1/d) independent
of size of A
Idea using a clever distribution, sample an
O(k-by-k) submatrix of A and compute its SVD
(Need to be able to sample A with the right
distribution)

32
Approximating the MST weight in sublinear time
Chazelle, Rubinfeld, Trevisan

Key subroutine estimate number of connected
components of a graph, in time depending on
expected error but not on size of graph
Idea for each vertex v define
f(v) 1/(size of component containing v)
Then Sv f(v) number of connected components
Estimate Sv f(v) by breadth-first search from a
few vertices

33
Approximate number of connected components

for each of r randomly chosen vertices u1 ... ur
do
breadth-first search to distance 1 from ui
repeat until bi is set
if coin flip yields heads or BFS has reached at
least w vertices
then set bi 0 and break repeat
continue breadth-first search to double
vertices reached
if BFS has reached all vertices in component
containing ui
then set bi 2(flips) / (vtxs reached) and
break repeat
end repeat
end for
return (n/r) sum (bi)

34
Managing data and computation on the web

Mapping, balancing, scheduling for parallel
computing
Data placement, compression, communication
optimization
Sophisticated cacheing (Akamai)
Internet backplane (UTK)
Active buffering, compressed migration (UIUC)

35
Modeling and distributed control
Imagining a molecular-scale crossbar switch
Heath et al., UCLA

Multiresolution modeling for nanomaterials and
nanosystems
Distributed control for systems on micro to
global scales

36
Active surface airjet paper mover
Berlin, Biegelsen et al., PARC
PC-hosted DSP control _at_ 1 kHz
12 x 12 board
sensors 32,000 gray-level pixels in 25 linear
arrays
576 valves (144 per direction)
37
A hard question

How will combinatorial methods be used by people
who dont understand them in detail?

38
Matrix division in Matlab

x A \ b
Works for either full or sparse A
Is A square?
no gt use QR to solve least squares problem
Is A triangular or permuted triangular?
yes gt sparse triangular solve
Is A symmetric with positive diagonal elements?
yes gt attempt Cholesky after symmetric minimum
degree
Otherwise
gt use LU on A(, colamd(A))

39
Matching and depth-first search in Matlab

dmperm Dulmage-Mendelsohn decomposition
Bipartite matching followed by strongly connected
components
Square, full rank A
p, q, r dmperm(A)
A(p,q) has nonzero diagonal and is in block upper
triangular form
also, strongly connected components of a directed
graph
also, connected components of an undirected graph
Arbitrary A
p, q, r, s dmperm(A)
maximum-size matching in a bipartite graph
minimum-size vertex cover in a bipartite graph
decomposition into strong Hall blocks

40
A few themes

Paths
Locality
Eigenvectors
Huge data sets
Multiple scales
Usability

41
Morals

Things are clearer if you look at them from two
directions
Combinatorial algorithms are pervasive in
scientific computing and will become more so
What are the implications for teaching?
What are the implications for software
development?

42
Thanks

Patrick Amestoy, Erik Boman, Iain Duff, Mary Ann
Branch Freeman, Bruce Hendrickson, Esmond Ng,
Alex Pothen, Padma Raghavan, Ronitt Rubinfeld,
Rob Schreiber, Sivan Toledo, Paul Van Dooren,
Steve Vavasis, Santosh Vempala, ...

Write a Comment

User Comments (0)

About PowerShow.com

Graph Algorithms in Numerical Linear Algebra: Past, Present, and Future PowerPoint PPT Presentation