CS 290H Lecture 11 BLAS, Supernodes, and SuperLU

Slides: 10

Provided by: JohnGi84

1
CS 290H Lecture 11: BLAS, Supernodes, and SuperLU

Read "SuperLU_DIST: A scalable distributed-memory
sparse direct solver for unsymmetric linear
systems" (reader 5)
Homework 3 due Sunday 21 November
No class next Tue 9 Nov (SC 2004) or Thu 11 Nov
(holiday)
If you haven't told me what your final project
is, do so ASAP
See Kathy Yelick's slides on matrix
multiplication and BLAS

2
Left-looking Column LU Factorization

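for column j = 1 to n do
  solve: triangular system with columns 1 to j-1 of L and right-hand side a_j, for u_j and l_j
  pivot: swap u_jj and an element of l_j
  scale: l_j = l_j / u_jj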

Column j of A becomes column j of L and U
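
The solve / pivot / scale loop above can be sketched in a few lines. This is a minimal dense illustration in pure Python, not the sparse code used by GP or SuperLU; the function name `left_looking_lu` and the list-of-lists storage are assumptions made only for this example.

```python
def left_looking_lu(A):
    """Left-looking column LU with partial pivoting on a dense list-of-lists.

    Returns (L, U, perm) with L unit lower triangular, U upper triangular,
    and row i of L*U equal to row perm[i] of A.
    """
    n = len(A)
    A = [row[:] for row in A]            # work on a copy
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    perm = list(range(n))

    for j in range(n):
        # Column j of the row-permuted matrix.
        x = [A[i][j] for i in range(n)]

        # Solve: apply the finished columns 0..j-1 of L to column j.
        for k in range(j):
            xk = x[k]
            if xk != 0.0:
                for i in range(k + 1, n):
                    x[i] -= L[i][k] * xk

        # Pivot: bring the largest remaining entry to the diagonal.
        p = max(range(j, n), key=lambda i: abs(x[i]))
        if x[p] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if p != j:
            x[j], x[p] = x[p], x[j]
            A[j], A[p] = A[p], A[j]
            L[j], L[p] = L[p], L[j]      # keep finished columns of L consistent
            perm[j], perm[p] = perm[p], perm[j]

        # Scale: the top part is u_j, the bottom part divided by u_jj is l_j.
        for i in range(j + 1):
            U[i][j] = x[i]
        L[j][j] = 1.0
        for i in range(j + 1, n):
            L[i][j] = x[i] / x[j]
    return L, U, perm
```

Each pass touches only column j of A and the already-finished columns of L, which is what makes the method left-looking.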

3
Symmetric Pruning
Eisenstat, Liu
Idea: Depth-first search in a sparser graph with
the same path structure

Use (just-finished) column j of L to prune
earlier columns
No column is pruned more than once
The pruned graph is the elimination tree if A is
symmetric
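
A small sketch of the symmetric case may help here. It assumes the usual statement of the Eisenstat-Liu rule, which is not spelled out in this transcript: column r is pruned at row j when both l_jr and u_rj are nonzero, dropping the entries of column r below row j. For a symmetric pattern u_rj is nonzero exactly when l_jr is, so each column is cut at its first off-diagonal nonzero. The helper names and the 5-vertex example graph are hypothetical.

```python
def symbolic_cholesky(n, edges):
    """cols[j] = sorted rows i > j with l_ij nonzero (fill included)."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    cols = []
    for j in range(n):
        higher = sorted(i for i in adj[j] if i > j)
        cols.append(higher)
        for a in higher:                 # eliminating j joins its later neighbors
            for b in higher:
                if a != b:
                    adj[a].add(b)
    return cols


def symmetric_prune(cols):
    """Use each just-finished column j to prune earlier columns (symmetric case)."""
    n = len(cols)
    pruned = [list(c) for c in cols]
    done = [False] * n                   # no column is pruned more than once
    for j in range(n):
        for r in range(j):
            if not done[r] and j in cols[r]:   # l_jr nonzero, hence u_rj nonzero
                pruned[r] = [i for i in cols[r] if i <= j]
                done[r] = True
    return pruned


def reach(graph, start):
    """Vertices reachable from start along edges j -> i for i in graph[j]."""
    seen, stack = set(), [start]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph[v])
    return seen


cols = symbolic_cholesky(5, [(0, 2), (0, 3), (1, 3), (2, 4), (3, 4)])
pruned = symmetric_prune(cols)
# cols   is [[2, 3], [3], [3, 4], [4], []]
# pruned is [[2], [3], [3], [4], []], i.e. the elimination tree parent pointers
```

In this example `reach(cols, v)` and `reach(pruned, v)` return the same set for every vertex v, which is the "same path structure" property: the depth-first search visits the same vertices but scans fewer edges.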

4
GP-Mod Algorithm
Matlab 5

Left-looking column-by-column factorization
Depth-first search to predict structure of each
column
Symmetric pruning to reduce symbolic cost
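
The depth-first structure prediction can be sketched as follows. This is an illustrative example only: it assumes L is unit lower triangular and stored as a list of per-column dicts (`L[k][i]` holds l_ik for i > k), and both function names are made up for the sketch. A production code would use an iterative DFS and would search the pruned graph rather than all of L.

```python
def reach_topological(L, b_pattern):
    """Nonzero pattern of x in L x = b, listed in a valid elimination order."""
    visited = set()
    postorder = []

    def dfs(k):
        visited.add(k)
        for i in L[k]:                   # edge k -> i whenever l_ik is nonzero
            if i not in visited:
                dfs(i)
        postorder.append(k)

    for k in b_pattern:
        if k not in visited:
            dfs(k)
    return postorder[::-1]               # reverse postorder is a topological order


def sparse_lower_solve(L, b):
    """Solve L x = b for sparse b (dict row -> value), touching only nonzeros of x."""
    x = dict(b)
    for k in reach_topological(L, sorted(b)):
        xk = x.get(k, 0.0)
        for i, lik in L[k].items():      # one sparse vector update per column
            x[i] = x.get(i, 0.0) - lik * xk
    return x


L = [{2: 0.5, 3: 1.0}, {3: 2.0}, {3: -1.0, 4: 0.5}, {4: 1.0}, {}]
x = sparse_lower_solve(L, {0: 2.0})
# x is {0: 2.0, 2: -1.0, 3: -3.0, 4: 3.5}; column 1 is never visited
```

The inner loop is the sparse vector kernel with indirect addressing on every flop.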

+ Much cheaper symbolic factorization than GP (~4x)
- Indirect addressing for each flop (sparse vector kernel)
- Poor reuse of data in cache (BLAS-1 kernel)
=> Supernodes

5
Symmetric supernodes for Cholesky (GLN section 6.5)

Supernode = group of adjacent columns of L with
same nonzero structure
Related to clique structure of filled graph G+(A)
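
One way to make the definition concrete is to scan the column structures of L and merge column j into the current supernode when column j-1 has exactly the structure of column j plus row j. The sketch below assumes `cols[j]` is the sorted list of rows i > j with l_ij nonzero; the function name and the example structure are hypothetical.

```python
def find_supernodes(cols):
    """Return supernodes as (first, last) column ranges, both inclusive."""
    n = len(cols)
    supernodes = []
    first = 0
    for j in range(1, n):
        # Same structure below the diagonal block: col j-1 = {j} plus col j.
        if cols[j - 1] != [j] + cols[j]:
            supernodes.append((first, j - 1))
            first = j
    if n > 0:
        supernodes.append((first, n - 1))
    return supernodes


cols = [[2, 3], [3], [3, 4], [4], []]
print(find_supernodes(cols))   # prints [(0, 0), (1, 1), (2, 4)]
```

Columns 2 to 4 form a dense lower triangular block here, so they are grouped; columns 0 and 1 stay as singleton supernodes.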

Supernode-column update: k sparse vector ops
become 1 dense triangular solve + 1 dense
matrix * vector + 1 sparse vector add
Sparse BLAS 1 => Dense BLAS 2
Only need row numbers for first column in each
supernode
For model problem, integer storage for L is O(n)
not O(n log n)
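
The supernode-column update can be sketched by comparing it with the column-by-column version it replaces. This is a pure-Python illustration under assumed storage: `T` is the k-by-k dense unit lower triangular diagonal block of the supernode, `B` is the dense block of rows below it, and `rows` is the single shared list of row indices. In a real code the first two steps would be dense BLAS-2 calls (a triangular solve and a matrix-vector product); all names here are hypothetical.

```python
def supernode_column_update(x, first, T, B, rows):
    """Update the dense work vector x by one supernode starting at column first."""
    k = len(T)
    # 1 dense triangular solve: T * u = x[first:first+k]
    u = x[first:first + k]
    for c in range(k):
        for r in range(c + 1, k):
            u[r] -= T[r][c] * u[c]
    x[first:first + k] = u
    # 1 dense matrix-vector product: y = B * u
    y = [sum(B[r][c] * u[c] for c in range(k)) for r in range(len(rows))]
    # 1 sparse vector add: scatter y into x using the shared row indices
    for r, i in enumerate(rows):
        x[i] -= y[r]
    return x


def column_by_column_update(x, first, T, B, rows):
    """Same update done as k separate sparse vector operations."""
    k = len(T)
    for c in range(k):
        xc = x[first + c]
        for r in range(c + 1, k):
            x[first + r] -= T[r][c] * xc
        for r, i in enumerate(rows):
            x[i] -= B[r][c] * xc
    return x


T = [[1.0, 0.0], [0.5, 1.0]]
B = [[2.0, 1.0], [-1.0, 4.0]]
rows = [4, 6]
x0 = [0.0, 0.0, 2.0, 3.0, 1.0, 0.0, 5.0]
a = supernode_column_update(list(x0), 2, T, B, rows)
b = column_by_column_update(list(x0), 2, T, B, rows)
# both give [0.0, 0.0, 2.0, 2.0, -5.0, 0.0, -1.0]
```

The supernodal version uses the index list `rows` once per supernode instead of once per column, which is where the saving in indirect addressing comes from.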

6
Nonsymmetric Supernodes
(Figure: original matrix A)

7
Supernode-Panel Updates