Title: Daniel A. Spielman
1. Fast, Randomized Algorithms for Partitioning, Sparsification, and the Solution of Linear Systems
Joint work with Shang-Hua Teng (Boston University)
2. Papers
Nearly-Linear Time Algorithms for Graph Partitioning, Graph Sparsification, and Solving Linear Systems. S-Teng 04
The Mixing Rate of Markov Chains, an Isoperimetric Inequality, and Computing the Volume. Lovász-Simonovits 93
The Eigenvalues of Random Symmetric Matrices. Füredi-Komlós 81
3. Overview
Preconditioning to solve linear systems: find a B that approximates A, such that solving By = c is easy.
Three techniques:
- Augmented spanning trees (Vaidya 90)
- Sparsification: approximate a graph by a sparse graph
- Partitioning: runtime proportional to nodes removed
4. Outline
- Linear system solvers
- Sparsification using partitioning and random sampling
- Graph partitioning by truncated random walks
Themes: combinatorial and spectral graph theory; eigenvalues of random graphs
5. Diagonally Dominant Matrices
Solve Ax = b, where each diagonal entry of A is at least the sum of the absolute values of the other entries in its row.
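Diagonal dominance is easy to check directly; a minimal sketch (the helper name is mine, not from the talk):

```python
def is_diagonally_dominant(A):
    """Check that each diagonal entry is at least the sum of the
    absolute values of the off-diagonal entries in its row."""
    n = len(A)
    return all(
        A[i][i] >= sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )

# Laplacian of a path on 3 vertices: diagonally dominant (with equality).
L_path = [[1, -1, 0],
          [-1, 2, -1],
          [0, -1, 1]]
```

Graph Laplacians, the matrices the talk targets, satisfy this with equality in every row.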
6. Complexity of Solving Ax = b, A positive semi-definite
General direct methods:
Gaussian elimination (Cholesky): O(n^3)
Fast matrix inversion: O(n^2.376)
Conjugate gradient: O(mn)
n is the dimension, m the number of non-zeros
7. Complexity of Solving Ax = b, A positive semi-definite and structured
Express A = L L^T
Path: forward and backward elimination, O(n)
Tree: like a path, work up from the leaves, O(n)
Planar: nested dissection (Lipton-Rose-Tarjan 79), O(n^1.5)
8. Iterative Methods
Preconditioned conjugate gradient: find an easy B that approximates A.
Solves Ax = b in roughly sqrt(kappa(A,B)) iterations, each costing a multiply by A plus a solve of By = c.
Quality of approximation: the relative condition number kappa(A,B).
Cost per iteration: dominated by the time to solve By = c.
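To make the PCG loop concrete, here is a minimal dense-matrix sketch in Python, with the preconditioner supplied as a solve callback for By = c. This illustrates the generic method, not the talk's solver:

```python
def pcg(A, b, solve_B, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for dense SPD A (lists of lists).
    solve_B(r) should return the solution z of B z = r."""
    n = len(b)
    matvec = lambda M, v: [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = [0.0] * n
    r = b[:]                    # residual b - A x, with x = 0
    z = solve_B(r)              # preconditioner solve
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            break
        z = solve_B(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        rz = rz_new
        p = [zi + beta * pi for zi, pi in zip(z, p)]
    return x
```

With the identity preconditioner (solve_B = lambda r: r[:]) this reduces to plain CG; a better B cuts the iteration count to about sqrt(kappa(A,B)).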
9. Main Result
For symmetric, diagonally dominant A, iteratively solve Ax = b in nearly-linear time.
General: m log^{O(1)} m
Planar
10. Vaidya's Subgraph Preconditioners
Precondition A by the Laplacian B of a subgraph of A's graph.
[Figure: a graph A and a spanning subgraph B]
11. History
Vaidya 90: relate kappa(A,B) to graph embeddings (congestion/dilation); use the MST and augmented MST; planar
Gremban, Miller 96: Steiner vertices, many details; planar
Joshi 97, Reif 98: recursive, multi-level approach
Bern, Boman, Chen, Gilbert, Hendrickson, Nguyen, Toledo: all the details, algebraic approach
12. History
Maggs, Miller, Parekh, Ravi, Woo 02: fast solves after preprocessing
Boman, Hendrickson 01: low-stretch spanning trees
S-Teng 03: augmented low-stretch trees; clustering, partitioning, sparsification, recursion; lower-stretch spanning trees
13. The relative condition number
kappa(A,B) = lambda_max(A,B) / lambda_min(A,B), if A is positive semi-definite.
B <= A (as quadratic forms) if A - B is positive semi-definite, that is, x^T B x <= x^T A x for all x.
For A the Laplacian of a graph with edges E: x^T A x = sum over (i,j) in E of w_ij (x_i - x_j)^2.
14. Bounding kappa(A,B)
For B a subgraph of A, B <= A as quadratic forms,
and so lambda_min(A,B) >= 1 and kappa(A,B) <= lambda_max(A,B).
15. Fundamental Inequality
For a path on vertices 1, ..., 8,
(x_1 - x_8)^2 <= 7 [(x_1 - x_2)^2 + (x_2 - x_3)^2 + ... + (x_7 - x_8)^2]
16. Fundamental Inequality
More generally, along a path of k edges, (x_0 - x_k)^2 <= k times the sum of the squared differences along the path (Cauchy-Schwarz).
17. Application to eigenvalues of graphs
Eigenvalues are >= 0 for Laplacian matrices.
Example: for the complete graph on n nodes, all non-zero eigenvalues are n.
For the path, the smallest non-zero eigenvalue is tiny.
[Figure: path with test vector x = (-7, -5, -3, -1, 1, 3, 5, 7)]
18. Lower bound on lambda_min(A,B)
Guattery-Leighton-Miller 97
19. Lower bound on lambda_min(A,B)
So lambda_min(A,B) is bounded below,
and kappa(A,B) is bounded above.
20. Preconditioning with a Spanning Tree B
[Figure: a graph A and a spanning tree B]
Every edge of A not in B has a unique path in B.
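Since every non-tree edge has a unique tree path, its stretch (the length of that path, summed over edges on later slides) can be computed naively; an illustrative sketch for the unweighted case:

```python
def stretch(tree_adj, edges):
    """Total stretch of the given edges with respect to a spanning tree:
    for each edge (u, v), the length of the unique tree path u -> v."""
    def tree_dist(u, v):
        # BFS in the tree; the path it finds is the unique one.
        frontier, dist = [u], {u: 0}
        while frontier:
            nxt = []
            for w in frontier:
                for x in tree_adj[w]:
                    if x not in dist:
                        dist[x] = dist[w] + 1
                        nxt.append(x)
            frontier = nxt
        return dist[v]
    return sum(tree_dist(u, v) for u, v in edges)

# 4-cycle with the path 1-2-3-4 as spanning tree: the non-tree
# edge (1, 4) has stretch 3, each tree edge has stretch 1.
tree = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
```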
21. When B is a Tree
22. Low-Stretch Spanning Trees
Theorem (Boman-Hendrickson 01): kappa(A,B) is at most the total stretch of A's edges over the tree B, where the stretch of an edge is the length of its tree path.
Theorem (Alon-Karp-Peleg-West 91): every graph has a spanning tree of average stretch exp(O(sqrt(log n log log n))).
Theorem (Elkin-Emek-S-Teng 04): every graph has a spanning tree of average stretch O(log^2 n log log n).
23. Vaidya's Augmented Spanning Trees
B: a spanning tree plus s edges, n - 1 + s edges in total
24. Adding Edges to a Tree
25. Adding Edges to a Tree
Partition the tree into t sub-trees, balancing stretch.
26. Adding Edges to a Tree
Partition the tree into t sub-trees, balancing stretch. For sub-trees connected in A, add one such bridge edge, carefully chosen.
27. Adding Edges to a Tree
Theorem: with t sub-trees, the condition number is bounded, in general, and is better if planar.
28. Sparsification
Feder-Motwani 91; Benczur-Karger 96; Eppstein, Galil, Italiano, Spencer 93; Eppstein, Galil, Italiano, Nissenzweig 97
All graphs can be well-approximated by a sparse graph.
29. Sparsification
Benczur-Karger 96: can find a sparse subgraph H s.t. every cut is preserved to within a 1 +/- epsilon factor (edges are a subset, but with different weights); H has O(n log n / epsilon^2) edges.
We need the stronger spectral guarantee: kappa(A, H) small.
30. Example: Complete Graph
If A is the Laplacian of K_n, all non-zero eigenvalues are n.
If B is the Laplacian of a Ramanujan expander, suitably scaled, all non-zero eigenvalues are close to n.
And so kappa(A,B) is close to 1.
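The complete-graph claim is easy to verify without an eigensolver: any vector orthogonal to the all-ones vector is an eigenvector of the K_n Laplacian with eigenvalue n. A small check:

```python
n = 5
# Laplacian of the complete graph K_n: degree n-1 on the diagonal, -1 off it.
L = [[(n - 1) if i == j else -1 for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(Mi[j] * v[j] for j in range(len(v))) for Mi in M]

# Any vector summing to zero is an eigenvector with eigenvalue n.
x = [1, -1, 0, 0, 0]
assert matvec(L, x) == [n * xi for xi in x]
```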
31. Example: Dumbbell
[Figure: two copies of K_n joined by a single middle edge]
If B does not contain the middle edge, B is disconnected and kappa(A,B) is unbounded.
32. Example: Grid plus edge
[Figure: grid with an added edge; edge weights (m-1)^2 and 1; cuts of value (m-1)^2 and k(m-1)]
- Random sampling not sufficient.
- Cut approximation not sufficient.
33. Conductance
Cut: a partition (S, V - S) of the vertices.
Conductance of S: Phi(S) = (weight of edges crossing the cut) / min(vol(S), vol(V - S)), where vol is the sum of degrees.
Conductance of G: Phi_G = min over S of Phi(S).
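The definition translates directly into code; a sketch using one common normalization (volume = sum of degrees), which may differ in constants from the talk's exact convention:

```python
def conductance(adj, S):
    """Conductance of vertex set S in an unweighted graph given as an
    adjacency dict: edges leaving S divided by min(vol(S), vol(V - S))."""
    S = set(S)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj if u not in S)
    return cut / min(vol_S, vol_rest)

# Path 0-1-2-3: the cut {0, 1} has one crossing edge and volume 3
# on each side, so its conductance is 1/3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```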
34. Conductance
[Figure: example cuts S with differing conductance]
35. Conductance and Sparsification
If conductance is high (an expander): can precondition by random sampling.
If conductance is low: can partition the graph by removing few edges.
Decomposition: a partition of the vertex set that removes few edges, such that the graph on each part has high conductance.
36. Graph Decomposition
Lemma: there exists a partition of the vertices such that each V_i has large conductance and at most half the edges cross the partition.
Alg: sample the within-part edges; recurse on the crossing edges.
37. Graph Decomposition Exists
Each V_i has large conductance; at most half the edges cross.
Proof: let S be the largest set of small conductance; if a subset of V - S also had small conductance, S could be enlarged.
39. Graph Decomposition Exists
Proof (cont.): let S be the largest set of small conductance.
If S is big, V - S is not too big.
If S is small, only recurse in S.
Bounded recursion depth.
40. Sparsification from Graph Decomposition
Alg: sample the within-part edges; recurse on the crossing edges.
Theorem: there exists B with few edges and small kappa(A,B).
Need to find the partition in nearly-linear time.
41. Sparsification
Thm: given G, can find a subgraph H with few edges, in nearly-linear time.
Thm: for all A, can find a sparse preconditioner B.
General solve time: nearly linear.
42. Cheeger's Inequality (Sinclair-Jerrum 89)
For the Laplacian L and the diagonal matrix D with D_ii = degree of node i, the smallest non-zero eigenvalue lambda_2 of D^{-1/2} L D^{-1/2} satisfies
lambda_2 / 2 <= Phi_G <= sqrt(2 lambda_2).
43. Random Sampling
Given a graph, randomly sample to get a sparser graph, so that the spectral difference between the two (normalized by the diagonal matrix of degrees of the original) is small.
Useful if the conductance is big.
44. Useful if the conductance is big
If the sampled graph is spectrally close to the original, and the original has high conductance,
then, for all x orthogonal to the (mutual) nullspace, the two quadratic forms agree up to a small relative error.
45. But, don't want D in there
D has no impact here, so the normalized bound implies the unnormalized one.
46. Work with the adjacency matrix
A_ij = weight of the edge from i to j, and 0 if i = j.
No difference, because L = D - A.
47. Random Sampling Rule
Choose a parameter governing the sparsity of the sampled graph.
Keep edge (i,j) with probability p_ij.
If the edge is kept, raise its weight by a factor 1/p_ij.
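A sketch of the keep-and-reweight pattern, which makes the expected weight of every edge equal its original weight. The degree-based probability below is an illustrative guess; the exact formula on the slide is not recoverable here:

```python
import random

def sample_graph(edges, degree, k, seed=0):
    """Sparsify by random sampling: keep edge (i, j) with probability
    p_ij and, if kept, scale its weight by 1/p_ij so the expectation
    is unchanged.  p_ij = min(1, k / min(d_i, d_j)) is an assumed,
    plausible degree-based rule, with k governing the sparsity."""
    rng = random.Random(seed)
    kept = {}
    for (i, j), w in edges.items():
        p = min(1.0, k / min(degree[i], degree[j]))
        if rng.random() < p:
            kept[(i, j)] = w / p   # reweight to preserve expectation
    return kept
```

When k exceeds every degree, p_ij = 1 and the graph is returned unchanged, which gives a deterministic sanity check.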
48. Random Sampling Rule
The choice of p_ij guarantees the spectral bound,
and guarantees an expected near-linear number of edges in the sample.
49. Random Sampling Theorem
Theorem: the sampled graph is spectrally close with high probability.
Useful for graphs of high conductance.
We will prove the unweighted case; from now on, all edges have weight 1.
50. Analysis by Trace
Trace = sum of diagonal entries = sum of eigenvalues.
For even k, Tr(A^k) = sum_i lambda_i^k >= lambda_max^k.
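The identity Tr(A^k) = sum_i lambda_i^k is easy to check on a small symmetric matrix:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_power(A, k):
    """Trace of A^k; for symmetric A this equals sum_i lambda_i^k,
    which for even k upper-bounds lambda_max^k."""
    M = A
    for _ in range(k - 1):
        M = matmul(M, A)
    return sum(M[i][i] for i in range(len(A)))

# Adjacency matrix of a single edge: eigenvalues +1 and -1,
# so Tr(A^k) = 2 for every even k.
A = [[0, 1], [1, 0]]
```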
51. Analysis by Trace
Main Lemma: a bound on the expected trace of a high power of the error matrix.
Proof of Theorem: apply Markov's inequality, then take the k-th root.
52. Expected Trace
[Figure: 4-vertex example graph with edges labeled a_{1,2}, a_{2,3}, a_{3,4}, a_{2,1}, a_{4,2}]
53. Expected Trace
Most terms are zero because each edge's sampled deviation has mean zero.
So, the sum is zero unless each edge appears at least twice. We will code such walks to count them.
54. Coding walks
For a sequence of vertices v_0, ..., v_k:
S = { i : the edge (v_{i-1}, v_i) was not used before, in either direction }
For i not in S, sigma(i) = the index of the step at which edge (v_{i-1}, v_i) was used before, in either direction.
55. Coding walks: Example
step: 0, 1, 2, 3, 4, 5, 6, 7, 8
vert: 1, 2, 3, 4, 2, 3, 4, 2, 1
S = {1, 2, 3, 4}, the steps introducing the new edges 1-2, 2-3, 3-4, 4-2
60. Valid sigmas
For each i in S: v_i is a neighbor of v_{i-1} with probability of being chosen < 1; that is, a_{v_{i-1}, v_i} can be non-zero.
For each i not in S: the walk must take the edge indicated by sigma(i).
61. Expected Trace from Code
62-63. Expected Trace from Code
There are a bounded number of ways to choose sigma given S.
64. Random Sampling Theorem
Theorem: the sampled graph is spectrally close with high probability.
Useful for graphs of high conductance.
Random sampling sparsifies graphs of high conductance.
65. Graph Partitioning Algorithms
SDP/LP: too slow.
Spectral: one cut quickly, but it can be unbalanced; may need many runs.
Multilevel (Chaco/Metis): can't analyze; misses small sparse cuts.
New alg: based on truncated random walks. Approximates the optimal balance, can be used to decompose, nearly-linear run time.
66. Lazy Random Walk
At each step, stay put with probability 1/2; otherwise, move to a neighbor chosen according to edge weight.
Diffusing probability mass: keep 1/2 for self, distribute the rest among the neighbors.
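One diffusion step of the lazy walk in the unweighted case (uniform spreading over neighbors); a minimal sketch:

```python
def lazy_walk_step(adj, p):
    """One lazy random walk step: each vertex keeps half its probability
    mass and spreads the other half uniformly over its neighbors."""
    q = {v: 0.5 * p.get(v, 0.0) for v in adj}
    for v in adj:
        share = 0.5 * p.get(v, 0.0) / len(adj[v])
        for u in adj[v]:
            q[u] += share
    return q

# On a single edge, mass 1 at one endpoint splits evenly after one step.
edge = {0: [1], 1: [0]}
```

Total mass is conserved at every step, since each vertex redistributes exactly what it holds.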
67-70. Lazy Random Walk
Diffusing probability mass: keep 1/2 for self, distribute the rest among the neighbors. Or, with self-loops, distribute evenly over the edges.
[Figure: successive diffusion steps on a small example graph, showing the mass at each vertex at each step]
71. Lazy Random Walk
[Figure: limiting masses 3/8, 2/8, 2/8, 1/8 on vertices of degree 3, 2, 2, 1]
In the limit, probability mass is proportional to degree.
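The limiting claim can be checked directly: the degree-proportional distribution is fixed by one lazy-walk step (unweighted case; the step function is repeated here so the sketch is self-contained):

```python
def lazy_walk_step(adj, p):
    """One lazy random walk step (unweighted, uniform spreading)."""
    q = {v: 0.5 * p[v] for v in adj}
    for v in adj:
        share = 0.5 * p[v] / len(adj[v])
        for u in adj[v]:
            q[u] += share
    return q

# A small graph with degrees 2, 3, 2, 1; stationary mass = degree / (2m).
adj = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
total_degree = sum(len(ns) for ns in adj.values())
pi = {v: len(adj[v]) / total_degree for v in adj}
after = lazy_walk_step(adj, pi)
```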
72. Why lazy?
Otherwise, there might be no limit.
[Figure: on a single edge, the non-lazy walk oscillates between (0,1) and (1,0)]
73. Why so lazy as to keep 1/2 the mass?
Diagonally dominant matrices:
off-diagonal entries = weight of the edge from i to j; diagonal entries = degree of node i (when i = j).
74. Rate of convergence
Cheeger's inequality relates it to conductance: if conductance is low, convergence is slow.
If we start uniform on S, the probability of leaving at each step is at most Phi(S).
After O(1/Phi(S)) steps, at least 3/4 of the mass is still in S.
75. Lovász-Simonovits Theorem
If convergence is slow, then there is a low-conductance cut, and we can find the cut from the highest-probability nodes.
[Figure: vertex probabilities .137, .135, .134, .129, .112, .094]
76. Lovász-Simonovits Theorem
From now on, every node has the same degree, d.
For all vertex sets S, and all t <= T, the theorem bounds the mass on the k vertices with the most probability at step t.
77. Lovász-Simonovits Theorem
If we start from a node in a set S of small conductance, the walk will output a set of small conductance.
Extension: the output is mostly contained in S.
78. Speed of Lovász-Simonovits
If we want a cut of a given conductance, we can bound the number of steps needed.
We want run-time proportional to the vertices removed: local clustering on a massive graph.
Problem: most vertices can have non-zero mass by the time the cut is found.
79. Speeding up Lovász-Simonovits
Round all small entries of the walk vector to zero.
Algorithm Nibble: input a start vertex v and a target set size k; at each step, all values below a threshold map to zero.
Theorem: if v is in a size-k set of small conductance, Nibble outputs a set of small conductance, mostly overlapping it.
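A sketch of the truncation idea: diffuse, then zero out entries below a threshold so the support, and hence the work per step, stays small. The threshold `eps` here is a free parameter; the talk's exact threshold (in terms of the target size k) is not recoverable from the slides:

```python
def nibble_walk(adj, v, steps, eps):
    """Truncated lazy random walk sketch: after each diffusion step,
    drop entries below eps, so work is proportional to the support
    of the vector rather than to the whole graph."""
    p = {v: 1.0}
    for _ in range(steps):
        q = {}
        for u, mass in p.items():
            q[u] = q.get(u, 0.0) + 0.5 * mass
            share = 0.5 * mass / len(adj[u])
            for w in adj[u]:
                q[w] = q.get(w, 0.0) + share
        p = {u: m for u, m in q.items() if m >= eps}  # truncation step
    return p
```

With eps = 0 this is the exact lazy walk; a positive eps trades a little mass for a small support.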
80. The Potential Function
Defined at each integer k; linear in between these points.
81. Concave: slopes decrease
82. Easy Inequality
83. Fancy Inequality
84. Fancy Inequality
85. Chords with little progress
86. Dominating curve makes progress
87. Dominating curve makes progress
88. Proof of Easy Inequality
Order the vertices by probability mass at time t-1 and at time t.
If the top k at time t-1 only connect to the top k at time t, we get equality.
89. Proof of Easy Inequality (cont.)
Otherwise, some mass leaves, and we get a strict inequality.
90. External edges from Self Loops
Lemma: for every set S, and every set R of the same size, at least a Phi(S) fraction of the edges from S don't hit R.
Tight example: [Figure: sets S and R with Phi(S) = 1/2]
91. External edges from Self Loops
Lemma: for every set S, and every set R of the same size, at least a Phi(S) fraction of the edges from S don't hit R.
Proof: when R = S, this holds by definition. Each vertex in R - S can absorb d edges from S, but each vertex of S - R has d self-loops that do not go to R.
92. Proof of Fancy Inequality
At time t-1, split the mass reaching the top k at time t into "top", "in", and "out" portions; then
I_t(k) <= top + in <= top + (in + out)/2 = top/2 + (top + in + out)/2,
using in <= out, which follows from the self-loop lemma.
93. Local Clustering
Theorem: if S is a set of small conductance and v is a random vertex of S, then the algorithm outputs a set of small conductance, mostly in S, in time proportional to the size of the output.
94. Local Clustering
Can it be done for all conductances?
95. Experimental code: ClusTree
1. Cluster, crudely.
2. Make trees in the clusters.
3. Add edges between the trees, optimally.
No recursion on reduced matrices.
96. Implementation
ClusTree in Java (timing not included; could increase total time by 20%).
PCG: Cholesky, drop-tolerance incomplete Cholesky, Vaidya (TAUCS: Chen, Rotkin, Toledo).
Orderings: amd, genmmd, Metis, RCM.
Intel Xeon 3.06 GHz, 512 KB L2 cache, 1 MB L3 cache.
97. 2D grid, Neumann boundary
Run to residual error 10^-8.
98. Impact of boundary conditions
2D grid, Dirichlet
2D grid, Neumann
99. 2D Unstructured Delaunay Mesh
Dirichlet
Neumann
100. Future Work
Practical local clustering; more sparsification; other physical problems; Cheeger's inequality; implications for combinatorics, and spectral hypergraph theory.
101. To learn more
Nearly-Linear Time Algorithms for Graph Partitioning, Sparsification, and Solving Linear Systems (STOC 04, arXiv).
Will be split into two papers: numerical and combinatorial.
My lecture notes for Spectral Graph Theory and its Applications.