BFS and DFS

Slides: 78

Provided by: norbert70

Transcript and Presenter's Notes

1
BFS and DFS

BFS and DFS in directed graphs
BFS in undirected graphs
An improved undirected BFS-algorithm

2
The Buffered Repository Tree (BRT)

Stores key-value pairs (k,v)
Supported operations
INSERT(k,v) inserts a new pair (k,v) into T
EXTRACT(k) extracts all pairs with key k
Complexity
INSERT: O((1/B) log2(N/B)) I/Os amortized
EXTRACT: O(log2(N/B) + K/B) I/Os amortized (K = number of reported elements)

3
The Buffered Repository Tree (BRT)

(2,4)-tree

Leaves store between B/4 and B elements

Internal nodes have buffers of size B

Root in main memory, rest on disk

4
INSERT(k,v)

O(X/B) I/Os to empty a buffer of size X ≥ B
Amortized charge per element and level: O(1/B)
Height of tree: O(log2(N/B))
Insertion cost: O((1/B) log2(N/B)) amortized

5
EXTRACT(k)

Number of traversed nodes: O(log2(N/B) + K/B)
I/Os per node: O(1)
Cost of operation: O(log2(N/B) + K/B)
But careful with removal of extracted elements

[Figure: elements with key k]
6
Cost of Rebalancing

O(N/B) leaf creations and deletions
O(N/B) node splits, fusions, merges
Each such operation costs O(1) I/Os
⇒ O(N/B) I/Os for rebalancing
Theorem: The BRT supports INSERT and EXTRACT operations in O((1/B) log2(N/B)) and O(log2(N/B) + K/B) I/Os amortized, respectively.
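
The buffering idea behind INSERT and EXTRACT can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions that are not from the slides: the class name `ToyBRT` is hypothetical, the tree is a static binary tree over integer keys `0..universe-1` rather than a rebalanced (2,4)-tree, and Python lists stand in for disk blocks. It shows how inserts land in the root buffer, how a full buffer is pushed down one level, and how EXTRACT(k) collects matching pairs along one root-to-leaf path.

```python
class ToyBRT:
    """Toy, in-memory model of a buffered repository tree (not I/O-efficient)."""

    def __init__(self, universe, B=4):
        self.B = B                                   # buffer capacity
        self.height = max(1, (universe - 1).bit_length())
        self.size = 1 << self.height                 # number of leaves
        self.buf = {}                                # node id -> list of (key, value)

    def _node_on_path(self, key, depth):
        # Heap numbering: root is 1, leaf of `key` is size + key.
        return (self.size + key) >> (self.height - depth)

    def insert(self, key, value):
        # INSERT(k, v): append to the root buffer, push down on overflow.
        self.buf.setdefault(1, []).append((key, value))
        self._flush(1, 0)

    def _flush(self, node, depth):
        items = self.buf.get(node, [])
        if depth == self.height or len(items) <= self.B:
            return                                   # leaves never overflow in this toy
        self.buf[node] = []
        for key, value in items:
            child = self._node_on_path(key, depth + 1)
            self.buf.setdefault(child, []).append((key, value))
        self._flush(2 * node, depth + 1)
        self._flush(2 * node + 1, depth + 1)

    def extract(self, key):
        # EXTRACT(k): visit every buffer on the root-to-leaf path of k and
        # remove all pairs with that key.
        found = []
        for depth in range(self.height + 1):
            node = self._node_on_path(key, depth)
            items = self.buf.get(node, [])
            found += [v for k, v in items if k == key]
            self.buf[node] = [(k, v) for k, v in items if k != key]
        return found
```

In the real structure each buffer flush moves a block's worth of elements per I/O, which is where the O(1/B) charge per element and level comes from; the toy only mirrors the data movement.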

7
Directed DFS

Algorithm proceeds as internal memory algorithm
Use stack to determine order in which vertices
are visited
For current vertex v
Find unvisited out-neighbor w
Push w on the stack
Continue search at w
If no unvisited out-neighbor exists
Remove v from stack
Continue search at v's parent
Stack operations cost O(N/B) I/Os

Problem: Finding an unvisited vertex

8
Directed DFS

Data structures
BRT T
Stores directed edges (v,w) with key v
Priority queues P(v), one per vertex
Stores unexplored out-edges of v
Invariant: every out-edge of v is in exactly one of three states
Not in P(v)
In P(v) and in T
In P(v), but not in T
9
Directed DFS

Finding next vertex after vertex v
(cost of one step; total over the whole algorithm)

EXTRACT(v): Retrieve red edges (the edges in P(v) and in T) from T
O(log2(E/B) + K1/B); total O(V log2(E/B) + E/B)
Remove these edges from P(v) using DELETE
O(sort(K1)); total O(V + sort(E))
Retrieve next edge (v,w) using DELETEMIN on P(v)
O((1/B) logm(E/B)); total O(sort(E))
Insert in-edges of w into T
O(1 + (K2/B) log2(E/B)); total O((E/B) log2(E/B))
Push w on the stack
O(1/B) amortized; total O(V/B)

Total: O((V + E/B) log2(E/B))
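
The interplay of the stack, T, and the queues P(v) can be sketched in ordinary Python. The sketch below is an assumption-laden illustration, not the I/O-efficient algorithm: a dictionary of sets stands in for the BRT, `heapq` lists stand in for the external priority queues, and the function name `directed_dfs` is made up for this example. It follows the five steps listed above.

```python
import heapq
from collections import defaultdict


def directed_dfs(out_adj, source):
    """Return the vertices reachable from `source` in DFS preorder."""
    in_adj = defaultdict(list)
    for v, targets in out_adj.items():
        for w in targets:
            in_adj[w].append(v)

    # P(v): unexplored out-edges of v, stored as a min-heap of targets.
    P = {v: list(targets) for v, targets in out_adj.items()}
    for heap in P.values():
        heapq.heapify(heap)

    T = defaultdict(set)      # stand-in for the BRT: key v -> visited targets w
    order, stack = [], []

    def visit(w):
        order.append(w)
        stack.append(w)       # push w on the stack
        for u in in_adj[w]:
            T[u].add(w)       # INSERT in-edge (u, w) into T with key u

    visit(source)
    while stack:
        v = stack[-1]
        dead = T.pop(v, set())                              # EXTRACT(v)
        heap = [w for w in P.get(v, []) if w not in dead]   # DELETE from P(v)
        heapq.heapify(heap)
        P[v] = heap
        if heap:
            visit(heapq.heappop(heap))                      # DELETEMIN on P(v)
        else:
            stack.pop()                                     # back to v's parent
    return order
```

After the DELETE step every edge left in P(v) leads to an unvisited vertex, so DELETEMIN never returns a vertex that has already been seen.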
10
Directed DFS and BFS

BFS can be solved using the same algorithm
Only modification: Use a queue (FIFO) instead of a stack
Theorem: Depth-first search and breadth-first search in a directed graph G = (V,E) can be solved in O((V + E/B) log2(E/B)) I/Os.
Exercise: Convince yourself that the priority queues P(v) are not necessary in the case of BFS.

11
Undirected BFS
Partition graph into levels L(0), L(1), ... around source

[Figure: levels L(0), L(1), L(2), L(3)]

Observation: For v ∈ L(i), all its neighbors are in L(i-1) ∪ L(i) ∪ L(i+1).
Build BFS-tree level by level
Initially, L(0) = {r}
Given levels L(i-1) and L(i)
Let X(i) = set of all neighbors of vertices in L(i)
Let L(i+1) = X(i) \ (L(i-1) ∪ L(i))

12
Undirected BFS

Constructing L(i+1)
Retrieve adjacency lists of vertices in L(i) ⇒ X(i)
Sort X(i)
Scan L(i-1), L(i), and X(i) to
Remove duplicates from X(i)
Compute X(i) \ (L(i-1) ∪ L(i))
Complexity: O(|L(i)| + sort(|L(i-1)| + |X(i)|)) I/Os

Summed over all levels: O(V + sort(E)) I/Os
Theorem: Breadth-first search in an undirected graph G = (V,E) can be solved in O(V + sort(E)) I/Os.
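
The level-by-level construction translates almost directly into code. The following is a minimal in-memory sketch, assuming the graph is given as a dictionary of adjacency lists; the function name `bfs_levels` is hypothetical, and Python sets replace the sort-and-scan passes that the external-memory version would use.

```python
def bfs_levels(adj, r):
    """Return the BFS levels L(0), L(1), ... of an undirected graph."""
    prev, cur = [], [r]          # L(i-1) and L(i)
    levels = []
    while cur:
        levels.append(cur)
        # X(i): all neighbors of vertices in L(i), sorted, duplicates removed.
        x = sorted({w for v in cur for w in adj[v]})
        # L(i+1) = X(i) \ (L(i-1) ∪ L(i)); no older level can appear in X(i).
        seen = set(prev) | set(cur)
        nxt = [w for w in x if w not in seen]
        prev, cur = cur, nxt
    return levels
```

Only the two most recent levels are consulted, which is exactly what the observation about neighbors of L(i) permits in an undirected graph.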
13
A Faster BFS-Algorithm

Problem with simple BFS-algorithm
Random accesses to retrieve adjacency lists
Idea for a faster algorithm
Load more than one adjacency list at a time
Reduces number of random accesses
Causes edges to be involved in more than one
iteration of the algorithm
Trade-off

14
A Faster BFS-Algorithm (Randomized)

Let 0 < μ < 1 be a parameter (specified later)
Two phases
Build μV disjoint clusters of diameter O(1/μ)
Perform modified version of SIMPLEBFS
Clusters C1,...,Cq formed using BFS from randomly chosen set V' = {r1,...,rq} of masters
Vertex is chosen as a master with probability μ (coin flip)
Observation: E[|V'|] = μV. That is, the expected number of clusters is μV.

15
Forming Clusters (Randomized)

Apply SIMPLEBFS to form clusters
L(0) = V'
v ∈ Ci if v is a descendant of ri

16
Forming Clusters (Randomized)

Lemma: The expected diameter of a cluster is 2/μ.
E[k] ≤ 1/μ
Corollary: The clusters are formed in expected O((1/μ) sort(E)) I/Os.
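
Cluster formation is a BFS started simultaneously from all masters. The sketch below is an in-memory illustration with assumptions of its own: the function name `form_clusters` and the `seed` parameter are made up, the source is always made a master so that every vertex of a connected graph ends up in some cluster, and a plain queue replaces SIMPLEBFS.

```python
import random
from collections import deque


def form_clusters(adj, source, mu, seed=0):
    """Pick masters by coin flips, then grow clusters by multi-source BFS.

    Returns (masters, cluster) where cluster[v] is the master of v's cluster.
    """
    rng = random.Random(seed)
    # Each vertex becomes a master with probability mu (the source always does;
    # that is an assumption of this sketch).
    masters = [v for v in adj if v == source or rng.random() < mu]

    cluster = {m: m for m in masters}     # L(0) = set of masters
    queue = deque(masters)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in cluster:          # w joins the cluster that reaches it first
                cluster[w] = cluster[v]
                queue.append(w)
    return masters, cluster
```

A larger `mu` gives more but smaller clusters, which is the trade-off the parameter controls.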

[Figure: vertices x, v1, v2, ..., vk and s]
17
Forming Clusters (Randomized)

Form files F1,...,Fq, one per cluster
Fi = concatenation of adjacency lists of vertices in Ci
Augment every edge (v,w) ∈ Fi with the start position pj of file Fj s.t. w ∈ Cj
Edge = triple (v,w,pj)
18
The BFS-Phase

Maintain a sorted pool H of edges s.t. adjacency lists of vertices in L(i) are contained in H

Scan L(i) and H to find vertices in L(i) whose adjacency lists are not in H
O((|L(i)| + |H|)/B)
Form list of start positions of files containing these adjacency lists and remove duplicates
O(sort(|L(i)|))
Retrieve files, sort them, and merge resulting list H' with H
O(K + sort(|H'|) + |H|/B)
Scan L(i) and H to build X(i)
O((|L(i)| + |H|)/B)
Construct L(i+1) from L(i-1), L(i), and X(i) as before
O(sort(|L(i)| + |L(i-1)| + |X(i)|))
19
The BFS-Phase

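I/O-complexity of single step:
O(K + |H|/B + sort(|H'| + |L(i-1)| + |L(i)| + |X(i)|))
Expected I/O-complexity: O(μV + E/(μB) + sort(E))
Choose μ = √(E/(VB)), which balances the first two terms
Theorem: BFS in an undirected graph G = (V,E) can be solved in O(√(VE/B) + sort(E)) I/Os.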

20
Single Source Shortest Paths

The tournament tree
SSSP in undirected graphs
SSSP in planar graphs

21
Single Source Shortest Paths

Need
I/O-efficient priority queue
I/O-efficient method to update only
unvisited vertices

22
The Tournament Tree

I/O-efficient priority queue
Supports
INSERT(x,p)
DELETE(x)
DELETEMIN
DECREASEKEY(x,p)
All operations take O((1/B)log2(N/B)) I/Os
amortized
Note: N is the size of the universe, not the number of elements in the tree

23
The Tournament Tree

Static binary tree over all elements in the
universe

Elements map to leaves, M elements per leaf

Internal nodes store between M/2 and M elements

Internal nodes have signal buffers of size M

Root in main memory, rest on disk

24
The Tournament Tree

Elements stored at each node are sorted by
priority
Elements at node v have smaller priority than elements at v's descendants
Convention: x ∈ T if and only if p(x) is finite

25
The Tournament Tree: Deletions

Operation DELETE(x) ⇒ signal DELETE(x)

[Figure: signal DELETE(x) sent toward x; labels DELETE(x) and UPDATE(x,∞)]
26
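The Tournament Tree: Insertions and Updates

Operations INSERT(x,p) and DECREASEKEY(x,p) ⇒ signal UPDATE(x,p)

When the signal reaches a node (w = next node on the path toward x's leaf):
x is stored at the node with current priority p': If p < p': Update. If p ≥ p': Do nothing.
x is not stored at the node and all elements are < p: Forward signal to w
x is not stored at the node and at least one element is ≥ p: Insert x, send DELETE(x) to w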
27
The Tournament Tree: Handling Overflow

Let y be the element with highest priority py
Send signal PUSH(y,py) to appropriate child of v
28
The Tournament Tree: Keeping the Nodes Filled

O(M/B) I/Os to move M/2 elements one level up the tree
29
The Tournament Tree: Signal Propagation

Scan v's signal buffer, partition into sets Xu and Xw
Load u into memory, apply signals in Xu to u, insert signals into u's signal buffer
Do the same for w
O((X + M)/B) = O(X/B) I/Os

30
The Tournament Tree: Analysis

Elements travel up the tree
Cost: O(1/B) I/Os amortized per element and level
⇒ O((K/B) log2(N/B)) I/Os for K operations
Signals travel down the tree
Cost: O(1/B) I/Os amortized per signal and level
O(K) signals for K operations
⇒ O((K/B) log2(N/B)) I/Os
Theorem: The tournament tree supports INSERT, DELETE, DELETEMIN, and DECREASEKEY operations in O((1/B) log2(N/B)) I/Os amortized.
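
The four operations have simple semantics once INSERT and DECREASEKEY are both read as UPDATE(x,p) and membership is encoded by a finite priority. The sketch below models only that interface, as an assumption-based illustration: the class name `ToyUpdateQueue` is hypothetical, a dictionary replaces the tree, and none of the buffering or I/O behavior is reproduced.

```python
import math


class ToyUpdateQueue:
    """Interface model of the tournament tree over a fixed universe."""

    def __init__(self, universe):
        # Convention: x is in the queue iff its priority is finite.
        self.p = {x: math.inf for x in universe}

    def update(self, x, p):
        # UPDATE(x, p): lower the priority of x, or do nothing if p is not smaller.
        if p < self.p[x]:
            self.p[x] = p

    # INSERT(x, p) and DECREASEKEY(x, p) both turn into UPDATE(x, p).
    insert = update
    decrease_key = update

    def delete(self, x):
        # DELETE(x): x leaves the queue, i.e. its priority becomes infinite.
        self.p[x] = math.inf

    def delete_min(self):
        # DELETEMIN: report and remove the element of smallest finite priority.
        if not self.p:
            return None
        x = min(self.p, key=self.p.get)
        if self.p[x] == math.inf:
            return None
        p, self.p[x] = self.p[x], math.inf
        return x, p
```

Because an UPDATE with a larger priority is ignored, a caller can issue updates blindly without first checking the stored priority, which is what the shortest-path algorithm relies on.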

31
Single Source Shortest Paths

Modified Dijkstra
Retrieve next vertex v from priority queue Q using DELETEMIN
Retrieve v's adjacency list
Update distances of all of v's neighbors, except predecessor u on the path from s to v
Repeat
O(V + (E/B) log2(V/B)) I/Os using tournament tree

32
Single Source Shortest Paths

Problem: Spurious updates of vertices that have already been visited
Observation: If v performs a spurious update of u, u has tried to update v before.
Record this update attempt of u on v by inserting u into another priority queue Q'
Priority: d(s,u) + w(u,v)

[Figure: edge between u and v]
33
Single Source Shortest Paths

Second modification
Retrieve next vertex using two DELETEMINs, one on Q, one on Q'
Let (x,px) be the element retrieved from Q, let (y,py) be the element retrieved from Q'
If px ≤ py: re-insert (y,py) into Q' and proceed as normal
If py < px: re-insert (x,px) into Q and perform a DELETE(y) on Q
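
The two-queue scheme can be tried out in memory. The sketch below is a hedged illustration rather than the external-memory algorithm: the function name `sssp_two_queues` is made up, binary heaps with lazy deletion stand in for the tournament trees, and the two queues are peeked at instead of performing DELETEMIN followed by re-insertion. It assumes an undirected graph with positive edge weights in which all vertices have different distances from s.

```python
import heapq
import math


def sssp_two_queues(adj, s):
    """adj maps each vertex to a list of (neighbor, weight) pairs."""
    dist = {}
    key = {s: 0}        # current priorities in Q (x is in Q iff x is in key)
    heap = [(0, s)]     # heap behind Q; stale entries are skipped lazily
    q2 = []             # Q': entries (d(s,u) + w(u,v), u)

    while True:
        while heap and key.get(heap[0][1]) != heap[0][0]:
            heapq.heappop(heap)              # drop stale entries of Q
        if not heap:
            break
        px, x = heap[0]
        if q2 and q2[0][0] < px:
            # py < px: the recorded update attempt comes first; DELETE(y) on Q.
            _, y = heapq.heappop(q2)
            key.pop(y, None)
            continue
        # px <= py: proceed as normal, x is final with distance px.
        heapq.heappop(heap)
        del key[x]
        dist[x] = px
        for w, c in adj[x]:
            p = px + c
            if p < key.get(w, math.inf):     # UPDATE(w, p) on Q, possibly spurious
                key[w] = p
                heapq.heappush(heap, (p, w))
            heapq.heappush(q2, (p, x))       # record x's update attempt in Q'
    return dist
```

Note that no "visited" set is consulted anywhere: already finished vertices are re-inserted into Q by their neighbors, and the entries in Q' remove them again before they can be retrieved.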

34
Single Source Shortest Paths

Lemma: A spurious update is removed from Q before the targeted vertex can be retrieved using DELETEMIN.
Event A: Spurious update happens (time d(s,v))
Event B: Vertex u is deleted by retrieval of u from Q' (time d(s,u) + w(e))
Event C: Vertex u is retrieved from Q using DELETEMIN operation (time d(s,v) + w(e))

[Figure: edge e between u and v]
35
Single Source Shortest Paths

Assume that all vertices have different distances from source s
d(u) < d(v)
d(v) ≤ d(u) + w(e) < d(v) + w(e)
Sequence of events: A → B → C
Theorem: The single source shortest path problem on an undirected graph G = (V,E) can be solved in O(V + (E/B) log2(V/B)) I/Os.

36
Planar Graphs

Shortest paths in planar graphs
Planar separators
Planar DFS

37
Shortest Paths in Planar Graphs
38
Shortest Paths in Planar Graphs

Observation: For every separator vertex v, the distances from s to v in G and GR are the same.
⇒ The distances from s to all separator vertices can be computed in GR.

[Figure: vertices s and v in G and in GR]
39
Shortest Paths in Planar Graphs

Observation: For every vertex v in Gi, dist(s,v) = min{dist(s,x) + dist(x,v) : x ∈ ∂Gi}.
⇒ Can compute dist(s,v) in the following graph

[Figure: graph with vertices s and v]
40
Shortest Paths in Planar Graphs

Three main steps
Solve all-pairs shortest paths in subgraphs Gi
Compute shortest paths from s to separator
vertices in GR
Compute shortest paths from s to all remaining
vertices
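
The third step applies the observation that a shortest path into a region must enter through its boundary. A minimal sketch, assuming the first two steps have already produced the boundary distances and the within-region distances; the function and parameter names are hypothetical.

```python
def distances_in_region(boundary_dist, region_dist, vertices):
    """Compute dist(s, v) for the interior vertices of one region Gi.

    boundary_dist[x]  : dist(s, x) for each boundary vertex x of Gi (step 2)
    region_dist[x][v] : dist(x, v) inside Gi (step 1)
    """
    return {
        v: min(boundary_dist[x] + region_dist[x][v] for x in boundary_dist)
        for v in vertices
    }
```

For example, with boundary distances `{"x1": 2, "x2": 5}` and within-region distances `{"x1": {"a": 4, "b": 1}, "x2": {"a": 0, "b": 3}}`, vertex `a` is best reached through `x2` and vertex `b` through `x1`.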

41
Shortest Paths in Planar Graphs

Regular h-partition
O(N/h) subgraphs G1,...,Gr
Each Gi has size at most h
Each Gi has boundary size at most O(√h)
Total number of separator vertices is O(N/√h)
Number of boundary sets is O(N/h)

42
Shortest Paths in Planar Graphs

Three main steps
Solve all-pairs shortest paths in subgraphs Gi
Compute shortest paths from s to separator
vertices in GR
Compute shortest paths from s to all remaining
vertices
Assume the given partition is a regular B²-partition
Steps 1 and 3 take O(scan(N)) I/Os
Graph GR has O(N/B) vertices and O(N) edges

43
Shortest Paths in Planar Graphs

Data structures
List L storing tentative distances of all
vertices
Priority queue Q storing vertices with their
tentative distances as priorities
One step
Retrieve next vertex v using DELETEMIN
Get distances of v's neighbors from L
Update their distances in Q using DELETE and INSERT
O(N + sort(N)) I/Os

44
Shortest Paths in Planar Graphs

One I/O per boundary set
Each boundary set is touched O(B) times
Once per vertex on the boundary of the region
O(N/B²) boundary sets ⇒ O(N/B) I/Os

45
Planar Separator

Goal: Compute a separator S of size O(N/√h) whose removal partitions G into subgraphs of size at most h.
Basic idea
Compute hierarchy of log(DB) graphs of geometrically decreasing size using graph contraction
Compute a separator of the smallest graph
Undo the contractions and maintain the separator while doing this
Assumption: M = Ω(h log2 B)

46
Planar Separator
47
Planar Separator

Properties
All Gi are planar
|Gi+1| ≤ |Gi|/2
Every vertex in Gi+1 represents only a constant number of vertices in Gi
Every vertex in Gi+1 represents at most 2^(i+2) vertices in G0
r = log2(DB) graphs G0,...,Gr
|Gr| = O(N/(DB))

48
Planar Separator
49
Planar Separator

Compute separator Sr of Gr
Sr partitions Gr into connected components of size at most h·log2(DB)
Takes O(|Gr|) = O(N/B) I/Os [AD96]

50
Planar Separator

Compute Si from Si+1
Let S'i be the set of vertices in Gi represented by the vertices in Si+1
Connected components of Gi − S'i have size at most c·h·log2(DB)
Partition every connected component of size more than h·log2(DB) into components of size h·log2(DB) ⇒ separator S''i
Takes O(sort(|Gi|)) I/Os
Connected components: O(sort(|Gi|))
Partitioning happens in internal memory
Total: O(sort(N)) I/Os

51
Planar Separator

Separator S0 partitions G0 into connected components of size at most h·log2(DB)
Size of S0

52
Planar Separator

Compute a superset S of S0 so that no connected component of G − S has size more than h
Partition every connected component of G − S0 separately in internal memory
Total number of extra separator vertices is
Extra cost: O(sort(N)) I/Os

53
Building the Graph Hierarchy

Properties
All Gi are planar
|Gi+1| ≤ |Gi|/2
Every vertex in Gi+1 represents only a constant number of vertices in Gi
Every vertex in Gi+1 represents at most 2^(i+2) vertices in G0

Build Gi+1 from Gi by
Contracting edges
Merging vertices of degree 2 with the same
neighbors

54
Building the Graph Hierarchy

Iterative approach
Extract set of edges that can be contracted
Contract subset of these edges to reduce number
of vertices by a factor of two
Repeat until no contractible edges remain

Problem
Standard graph contraction procedure may contract
too many vertices into a single vertex.

55
Building the Graph Hierarchy

Solution
Compute maximal matching of contractible subgraph
Contract edges in the matching

New problem
We may not contract sufficient number of edges to
reduce number of vertices by a constant factor

Two-stage contraction
Contract maximal matching
Contract edges between matched and unmatched
vertices
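
The two stages can be sketched on an edge list. This is an illustration under stated assumptions, not the procedure from the slides: the function name `two_stage_groups` is hypothetical, the matching is computed greedily in memory, every edge is treated as contractible, and the sketch does not bound how many unmatched vertices get attached to one matched vertex, which the real construction has to control.

```python
def two_stage_groups(edges):
    """Map every vertex to the representative of the group it is contracted into."""
    # Stage 1: greedy maximal matching over the edge list.
    mate = {}
    for u, v in edges:
        if u != v and u not in mate and v not in mate:
            mate[u], mate[v] = v, u

    # Each matched pair is contracted into one vertex (the smaller endpoint).
    group = {u: min(u, mate[u]) for u in mate}

    # Stage 2: contract edges between unmatched and matched vertices.
    # Maximality guarantees every neighbor of an unmatched vertex is matched.
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a not in mate and a not in group and b in mate:
                group[a] = group[b]
    return group
```

On the path 1-2-3-4-5 the matching picks (1,2) and (3,4), and the unmatched vertex 5 joins the group of 4, so five vertices shrink to two.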

56
Building the Graph Hierarchy

Why is this two-stage approach good?
No unmatched vertex remains in contractible
subgraph
Every matched vertex represents at least two
vertices before the contraction
⇒ Size of graph reduces by a factor of two
If a single iteration takes O(sort(|Gi|)) I/Os, the whole construction of Gi+1 from Gi takes O(sort(|Gi|)) I/Os

57
A Single Contraction Phase

Maximal matching can be computed and contracted
in O(sort(H)) I/Os, where H is the current
contractible subgraph
Bipartite contraction
Takes O(sort(H)) I/Os using buffer tree as
priority queue

58
Building the Graph Hierarchy

Lemma: Graph Gi+1 can be constructed from Gi in O(sort(|Gi|)) I/Os.
Corollary: The whole graph hierarchy can be built in O(sort(|G0|)) = O(sort(N)) I/Os.

59
Planar DFS
60
Planar DFS
61
Planar DFS

Observation: Every cycle in the i-th layer is a boundary cycle of graph Gi.

Every bicomp of a layer is a cycle.

62
DFS in a Layer
63
Planar DFS

DFS in a single layer Hi takes O(sort(|Hi|)) I/Os
Compute the bicomps
Root the bicomp tree
Remove one of the edges incident to parent cutpoint in each cycle
Total I/O-complexity: O(sort(N))

64
Planar DFS
65
Planar DFS
66
Building the Face-on-Vertex Graph
67
Lower Bounds and Open Problems

Lower bounds
List ranking, BFS, DFS, and shortest paths
Connected and biconnected components
Open problems

68
Lower Bounds: Split Proximate Neighbors

[Figure: example instance on the elements 1, ..., 8]
69
Lower Bounds: Split Proximate Neighbors

Lemma: Split proximate neighbors requires Ω(perm(N)) I/Os.

Total: O(I(N) + scan(N)) = O(I(N))
⇒ I(N) = Ω(perm(N))

70
Lower Bounds: List Ranking

Consider general algorithms for weighted list
ranking
Algorithm is only allowed to use associativity of
sum operator
Algorithm can be made to have the following
property
For every vertex v, v and succ(v) are both in
main memory at some point during the course of
the algorithm
Note The lower bound we show does not hold for
unweighted list ranking or weighted list ranking
over groups.

71
Lower Bounds: List Ranking

When both copies of x are in main memory, move to buffer of size B
When buffer full, flush to disk
⇒ Split proximate neighbors could be solved in O(I(N) + scan(N)) I/Os
⇒ I(N) = Ω(perm(N))

72
Lower Bounds: List Ranking, BFS, DFS, and Shortest Paths

Theorem: List ranking requires Ω(perm(N)) I/Os.
List ranking can be solved using BFS, DFS, or SSSP from the head of the list.
Theorem: BFS, DFS, and SSSP require Ω(perm(N)) I/Os.
Note: Again, the lower bound holds only for algorithms that compute distances from the source only by adding path lengths.

73
Lower Bounds: Segmented Duplicate Elimination

Let P ≤ N ≤ P²
Elements drawn from interval [2P+1, 3P]
Construct Boolean array C[2P+1..3P] s.t. C[i] = 1 iff i ∈ S
Proposition: Segmented duplicate elimination requires Ω(perm(N)) I/Os.
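
The required output is easy to state in code, which makes clear that the difficulty lies purely in the I/O cost and not in the computation. The sketch below is an in-memory illustration with a hypothetical function name; it ignores how the input is split into segments, a detail the transcript does not preserve.

```python
def boolean_array(elements, P):
    """Return C as a dict with C[i] = 1 iff i occurs in the input, for i in [2P+1, 3P]."""
    lo, hi = 2 * P + 1, 3 * P
    C = {i: 0 for i in range(lo, hi + 1)}
    for e in elements:
        if not lo <= e <= hi:
            raise ValueError(f"element {e} outside [{lo}, {hi}]")
        C[e] = 1
    return C
```

With P = 8 the interval is [17, 24]; on the element values that appear in the slide's example, the entries set to 1 are 17, 18, 19, 20, 22, and 23.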

[Figure: example with P = 8; input elements 19, 19, 20, 20, 22, 20, 18, 23, 17, 19 in segments of length P/2; S = {17, 18, 19, 20, 22, 23}]
74
Lower Bounds: Connected Components

[Figure: the example input from the previous slide with segments S1, ..., S4 and vertices 17, ..., 24]