BFS and DFS - PowerPoint PPT Presentation

Slides: 78
Provided by: norbert70

Transcript and Presenter's Notes


1
BFS and DFS
  • BFS and DFS in directed graphs
  • BFS in undirected graphs
  • An improved undirected BFS-algorithm

2
The Buffered Repository Tree (BRT)
  • Stores key-value pairs (k,v)
  • Supported operations
  • INSERT(k,v) inserts a new pair (k,v) into T
  • EXTRACT(k) extracts all pairs with key k
  • Complexity
  • INSERT: O((1/B) log2(N/B)) amortized
  • EXTRACT: O(log2(N/B) + K/B) amortized (K =
    number of reported elements)
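
To make the interface concrete, here is a minimal in-memory stand-in for the two operations. It is only a sketch of the semantics: the class name `ToyRepository` is hypothetical, and nothing here models the buffers, the (2,4)-tree, or the I/O bounds of the real BRT.

```python
from collections import defaultdict


class ToyRepository:
    """In-memory stand-in for the BRT interface (no buffers, no disk)."""

    def __init__(self):
        # key -> list of values inserted under that key
        self._pairs = defaultdict(list)

    def insert(self, k, v):
        """INSERT(k, v): store a new pair (k, v)."""
        self._pairs[k].append(v)

    def extract(self, k):
        """EXTRACT(k): return and remove all values stored with key k."""
        return self._pairs.pop(k, [])


repo = ToyRepository()
repo.insert("a", 1)
repo.insert("b", 2)
repo.insert("a", 3)
first = repo.extract("a")   # all pairs with key "a"
second = repo.extract("a")  # nothing left under "a"
```

Note that EXTRACT removes what it reports, so a second call with the same key returns nothing.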

3
The Buffered Repository Tree (BRT)
  • (2,4)-tree
  • Leaves store between B/4 and B elements
  • Internal nodes have buffers of size B
  • Root in main memory, rest on disk

4
INSERT(k,v)
  • O(X/B) I/Os to empty buffer of size X ≥ B
  • Amortized charge per element and level: O(1/B)
  • Height of tree: O(log2(N/B))
  • Insertion cost: O((1/B) log2(N/B)) amortized

5
EXTRACT(k)
  • Number of traversed nodes: O(log2(N/B) + K/B)
  • I/Os per node: O(1)
  • Cost of operation: O(log2(N/B) + K/B)
  • But careful with removal of extracted elements

[Figure: elements with key k]
6
Cost of Rebalancing
  • O(N/B) leaf creations and deletions
  • O(N/B) node splits, fusions, merges
  • Each such operation costs O(1) I/Os
  • O(N/B) I/Os for rebalancing
  • Theorem: The BRT supports INSERT and EXTRACT
    operations in O((1/B) log2(N/B)) and
    O(log2(N/B) + K/B) I/Os amortized.

7
Directed DFS
  • Algorithm proceeds as internal memory algorithm
  • Use stack to determine order in which vertices
    are visited
  • For current vertex v:
  • Find unvisited out-neighbor w
  • Push w on the stack
  • Continue search at w
  • If no unvisited out-neighbor exists:
  • Remove v from stack
  • Continue search at v's parent
  • Stack operations cost O(N/B) I/Os
  • Problem: Finding an unvisited vertex
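
The internal-memory procedure that the slide refers to can be sketched as follows. This is a plain in-memory DFS with an explicit stack; the `visited` set and the `next_idx` cursor are illustrative helpers, and they are exactly the part that becomes expensive in external memory.

```python
def dfs_order(adj, source):
    """Return vertices in DFS visiting order, using an explicit stack.

    adj maps every vertex to the list of its out-neighbors.
    """
    visited = {source}
    order = [source]
    stack = [source]
    next_idx = {source: 0}  # how far we have scanned each adjacency list

    while stack:
        v = stack[-1]  # current vertex
        nbrs = adj[v]
        i = next_idx[v]
        # Find an unvisited out-neighbor of v.
        while i < len(nbrs) and nbrs[i] in visited:
            i += 1
        next_idx[v] = i
        if i < len(nbrs):
            w = nbrs[i]
            visited.add(w)
            order.append(w)
            stack.append(w)  # continue search at w
            next_idx[w] = 0
        else:
            stack.pop()      # continue search at v's parent
    return order


example = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
example_order = dfs_order(example, 0)
```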

8
Directed DFS
  • Data structures
  • BRT T
  • Stores directed edges (v,w) with key v
  • Priority queues P(v), one per vertex
  • Stores unexplored out-edges of v
  • Invariant

[Figure legend: Not in P(v); In P(v) and in T; In P(v), but not in T]
9
Directed DFS
  • Finding next vertex w after vertex v (cost per step, then total cost):
  • EXTRACT(v): Retrieve red edges from T:
    O(log2(E/B) + K1/B) per step, O(V log2(E/B) + E/B) total
  • Remove these edges from P(v) using DELETE:
    O(sort(K1)) per step, O(V + sort(E)) total
  • Retrieve next edge using DELETEMIN on P(v):
    O((1/B) logm(E/B)) per step, O(sort(E)) total
  • Insert in-edges of w into T:
    O(1 + (K2/B) log2(E/B)) per step, O((E/B) log2(E/B)) total
  • Push w on the stack:
    O(1/B) amortized per step, O(V/B) total
  • Total: O((V + E/B) log2(E/B))
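
The interplay between T and the queues P(v) can be sketched in memory as follows. This is an illustration under stated assumptions, not the I/O-efficient implementation: a dictionary of lists stands in for the BRT T, a plain set stands in for each priority queue P(v), and every vertex is assumed to appear as a key of `out_adj`.

```python
from collections import defaultdict


def directed_dfs(out_adj, source):
    """DFS that never consults a 'visited' array.

    T (stand-in for the BRT) stores edges (v, w) under key v once w is visited.
    P[v] (stand-in for a priority queue) stores unexplored out-edges of v.
    """
    in_adj = defaultdict(list)
    for v, nbrs in out_adj.items():
        for w in nbrs:
            in_adj[w].append(v)

    T = defaultdict(list)
    P = {v: set(nbrs) for v, nbrs in out_adj.items()}

    def visit(w):
        # Insert all in-edges (u, w) of w into T with key u.
        for u in in_adj[w]:
            T[u].append(w)

    order, stack = [source], [source]
    visit(source)
    while stack:
        v = stack[-1]
        for w in T.pop(v, []):   # EXTRACT(v): edges of v leading to visited vertices
            P[v].discard(w)      # DELETE them from P(v)
        if P[v]:
            w = min(P[v])        # DELETEMIN on P(v)
            P[v].remove(w)
            visit(w)
            order.append(w)
            stack.append(w)
        else:
            stack.pop()
    return order


dfs_example = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
dfs_example_order = directed_dfs(dfs_example, 0)
```

After the EXTRACT and DELETE steps, every edge left in P(v) leads to an unvisited vertex, which is why DELETEMIN can be used blindly.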
10
Directed DFS & BFS
  • BFS can be solved using same algorithm
  • Only modification: Use queue (FIFO) instead of
    stack
  • Theorem: Depth-first search and breadth-first
    search in a directed graph G = (V,E) can be
    solved in O((V + E/B) log2(E/B)) I/Os.
  • Exercise: Convince yourself that the priority
    queues P(v) are not necessary in the case of BFS.

11
Undirected BFS
Partition graph into levels L(0), L(1), ... around source
[Figure: levels L(0), L(1), L(2), L(3)]
  • Observation: For v ∈ L(i), all its neighbors are
    in L(i-1) ∪ L(i) ∪ L(i+1).
  • Build BFS-tree level by level
  • Initially, L(0) = {r}
  • Given levels L(i-1) and L(i):
  • Let X(i) = set of all neighbors of vertices in
    L(i)
  • Let L(i+1) = X(i) \ (L(i-1) ∪ L(i))

12
Undirected BFS
  • Constructing L(i+1):
  • Retrieve adjacency lists of vertices in L(i) ⇒
    X(i)
  • Sort X(i)
  • Scan L(i-1), L(i), and X(i) to
  • Remove duplicates from X(i)
  • Compute X(i) \ (L(i-1) ∪ L(i))
  • Complexity: O(|L(i)| + sort(|L(i-1)| + |X(i)|))
    I/Os
  • Summed over all levels: O(V + sort(E)) I/Os
  • Theorem: Breadth-first search in an undirected
    graph G = (V,E) can be solved in O(V +
    sort(E)) I/Os.
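
The level-by-level construction can be sketched in memory as below. The function name `simple_bfs_levels` is illustrative; the sort and the single pass mirror the "sort X(i), then scan" steps, while a Python set stands in for the simultaneous scan of L(i-1) and L(i).

```python
def simple_bfs_levels(adj, r):
    """Return the BFS levels L(0), L(1), ... of an undirected graph.

    adj maps every vertex to the list of its neighbors.
    """
    levels = [[r]]
    prev = []
    while levels[-1]:
        cur = levels[-1]
        # Retrieve adjacency lists of L(i) and sort the multiset X(i).
        x = sorted(w for v in cur for w in adj[v])
        # One pass: remove duplicates and subtract L(i-1) and L(i).
        exclude = set(prev) | set(cur)
        nxt = []
        for w in x:
            if w not in exclude and (not nxt or nxt[-1] != w):
                nxt.append(w)
        prev = cur
        levels.append(nxt)
    return levels[:-1]  # drop the final empty level


bfs_example = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4], 4: [3]}
bfs_example_levels = simple_bfs_levels(bfs_example, 0)
```

No global "visited" structure is needed: by the observation above, excluding the two most recent levels is enough in an undirected graph.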
13
A Faster BFS-Algorithm
  • Problem with simple BFS-algorithm
  • Random accesses to retrieve adjacency lists
  • Idea for a faster algorithm
  • Load more than one adjacency list at a time
  • Reduces number of random accesses
  • Causes edges to be involved in more than one
    iteration of the algorithm
  • Trade-off

14
A Faster BFS-Algorithm (Randomized)
  • Let 0 < μ < 1 be a parameter (specified later)
  • Two phases
  • Build μV disjoint clusters of diameter O(1/μ)
  • Perform modified version of SIMPLEBFS
  • Clusters C1,...,Cq formed using BFS from randomly
    chosen set V' = {r1,...,rq} of masters
  • Vertex is chosen as a master with probability
    μ (coin flip)
  • Observation: E[|V'|] = μV. That is, the
    expected number of clusters is μV.

15
Forming Clusters (Randomized)
  • Apply SIMPLEBFS to form clusters
  • L(0) = V'
  • v ∈ Ci if v is descendant of ri
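
A small in-memory sketch of the cluster formation follows. It assumes a connected graph and, as an extra assumption not stated on the slide, always makes the source a master so that every vertex is reached. The names `form_clusters`, `mu`, and `seed` are illustrative.

```python
import random


def form_clusters(adj, source, mu, seed=0):
    """Grow clusters by a BFS started from all masters at once.

    Each vertex becomes a master with probability mu (coin flip);
    the source is always a master (assumption of this sketch).
    Returns (masters, cluster_of) where cluster_of[v] is a cluster index.
    """
    rng = random.Random(seed)
    masters = [v for v in sorted(adj) if v == source or rng.random() < mu]
    cluster_of = {r: i for i, r in enumerate(masters)}

    frontier = list(masters)          # level L(0) consists of all masters
    while frontier:
        nxt = []
        for v in frontier:
            for w in adj[v]:
                if w not in cluster_of:
                    cluster_of[w] = cluster_of[v]  # w joins its parent's cluster
                    nxt.append(w)
        frontier = nxt
    return masters, cluster_of


# Path graph 0 - 1 - ... - 9.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
path_masters, path_clusters = form_clusters(path, 0, 0.3)
```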

16
Forming Clusters (Randomized)
  • Lemma: The expected diameter of a cluster is 2/μ.
  • E[k] ≤ 1/μ
  • Corollary: The clusters are formed in expected
    O((1/μ) sort(E)) I/Os.

[Figure labels: s, x, v1, v2, v3, v4, v5, ..., vk]
17
Forming Clusters (Randomized)
  • Form files F1,...,Fq, one per cluster: Fi =
    concatenation of adjacency lists of vertices in
    Ci
  • Augment every edge (v,w) ∈ Fi with the start
    position of file Fj s.t. w ∈ Cj
  • Edge = triple (v,w,pj)

18
The BFS-Phase
  • Maintain a sorted pool H of edges s.t. adjacency
    lists of vertices in L(i) are contained in H
  • Scan L(i) and H to find vertices in L(i) whose
    adjacency lists are not in H: O((|L(i)| + |H|)/B)
  • Form list of start positions of files containing
    these adjacency lists and remove duplicates:
    O(sort(|L(i)|))
  • Retrieve files, sort them, and merge resulting
    list H' with H: O(K + sort(|H'|) + |H|/B)
  • Scan L(i) and H to build X(i): O((|L(i)| + |H|)/B)
  • Construct L(i+1) from L(i-1), L(i), and X(i)
    as before: O(sort(|L(i)| + |L(i-1)| + |X(i)|))
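
The pool-based BFS phase can be sketched in memory as follows. The clustering is passed in as a dictionary `cluster_of` (an assumed input), one list per cluster stands in for the files, and adjacency lists are dropped from the pool once used, which is an assumption of this sketch rather than something the transcript states.

```python
import heapq


def clustered_bfs(adj, source, cluster_of):
    """BFS levels computed with a sorted edge pool H fed by cluster files."""
    # File of cluster c: concatenated adjacency lists of its vertices.
    files = {}
    for v, c in cluster_of.items():
        files.setdefault(c, []).extend((v, w) for w in adj[v])

    pool = []        # sorted pool H of edges (v, w)
    loaded = set()   # clusters whose file has already been merged into H
    prev, cur = set(), {source}
    levels = []
    while cur:
        levels.append(sorted(cur))
        # Find clusters of level vertices whose lists are not yet in H.
        needed = {cluster_of[v] for v in cur} - loaded
        # Retrieve these files, sort them, and merge the result H' with H.
        fetched = sorted(e for c in needed for e in files[c])
        pool = list(heapq.merge(pool, fetched))
        loaded |= needed
        # Scan L(i) and H to build X(i); used lists leave the pool.
        x = {w for (v, w) in pool if v in cur}
        pool = [(v, w) for (v, w) in pool if v not in cur]
        # Construct L(i+1) as in the simple algorithm.
        prev, cur = cur, x - prev - cur
    return levels


pool_adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [2, 4, 1], 4: [3]}
pool_levels = clustered_bfs(pool_adj, 0, {0: 0, 1: 0, 2: 1, 3: 1, 4: 1})
```

Each cluster file is fetched at most once, which is where the saving in random accesses comes from; the price is that edges sit in the pool for several iterations.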
19
The BFS-Phase
  • I/O-complexity of single step:
  • O(K + |H|/B + sort(|H'| + |L(i-1)| + |L(i)| +
    |X(i)|))
  • Expected I/O-complexity: O(μV + E/(μB) +
    sort(E))
  • Choose μ = [value not preserved]
  • Theorem: BFS in an undirected graph G = (V,E) can
    be solved in [bound not preserved] I/Os.
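
The transcript does not preserve the chosen parameter value or the final bound. One natural reading, offered here only as a hedged reconstruction, is to balance the first two terms of the expected I/O-complexity, writing μ for the cluster parameter and assuming E ≤ VB so that μ ≤ 1:

```latex
\begin{align*}
\mu V = \frac{E}{\mu B}
  \;&\Longrightarrow\; \mu = \sqrt{\frac{E}{VB}}, \\
\mu V + \frac{E}{\mu B}
  \;&=\; 2\sqrt{\frac{VE}{B}}, \\
O\!\left(\mu V + \frac{E}{\mu B} + \mathrm{sort}(E)\right)
  \;&=\; O\!\left(\sqrt{\frac{VE}{B}} + \mathrm{sort}(E)\right).
\end{align*}
```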

20
Single Source Shortest Paths
  • The tournament tree
  • SSSP in undirected graphs
  • SSSP in planar graphs

21
Single Source Shortest Paths
  • Need
  • I/O-efficient priority queue
  • I/O-efficient method to update only
    unvisited vertices

22
The Tournament Tree
  • I/O-efficient priority queue
  • Supports
  • INSERT(x,p)
  • DELETE(x)
  • DELETEMIN
  • DECREASEKEY(x,p)
  • All operations take O((1/B) log2(N/B)) I/Os
    amortized
  • Note: N = size of the universe ≠ number of
    elements in the tree
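
The four operations can be illustrated with a small in-memory stand-in. This sketch uses a binary heap with lazy deletion, not the static tree with signal buffers described on the following slides; the class name `ToyTournamentQueue` is hypothetical.

```python
import heapq
import math


class ToyTournamentQueue:
    """In-memory stand-in for the tournament tree interface."""

    def __init__(self):
        self._prio = {}   # x -> current (finite) priority
        self._heap = []   # (priority, x) entries, possibly stale

    def _update(self, x, p):
        # Insert x, or lower its priority; larger priorities are ignored.
        if p < self._prio.get(x, math.inf):
            self._prio[x] = p
            heapq.heappush(self._heap, (p, x))

    def insert(self, x, p):
        self._update(x, p)

    def decrease_key(self, x, p):
        self._update(x, p)

    def delete(self, x):
        self._prio.pop(x, None)  # no-op if x is absent

    def delete_min(self):
        """Return (x, p) with minimum priority, or None if empty."""
        while self._heap:
            p, x = heapq.heappop(self._heap)
            if self._prio.get(x) == p:   # skip stale entries
                del self._prio[x]
                return x, p
        return None


tq = ToyTournamentQueue()
tq.insert("a", 5)
tq.insert("b", 3)
tq.decrease_key("a", 1)
tq.delete("b")
tq_first = tq.delete_min()
tq_second = tq.delete_min()
```

INSERT and DECREASEKEY share one code path here, which mirrors how both are turned into the same UPDATE signal in the tree.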

23
The Tournament Tree
  • Static binary tree over all elements in the
    universe
  • Elements map to leaves, M elements per leaf
  • Internal nodes store between M/2 and M elements
  • Internal nodes have signal buffers of size M
  • Root in main memory, rest on disk

24
The Tournament Tree
  • Elements stored at each node are sorted by
    priority
  • Elements at node v have smaller priority than
    elements at v's descendants
  • Convention: x ∈ T if and only if p(x) is finite

25
The Tournament Tree: Deletions
  • Operation DELETE(x) ⇒ signal DELETE(x)

[Figure labels: x; DELETE(x); UPDATE(x,∞)]
26
The Tournament Tree: Insertions and Updates
  • Operations INSERT(x,p) and DECREASEKEY(x,p) ⇒
    signal UPDATE(x,p)
  • If x is stored at the node with current priority p':
  • If p < p': Update
  • If p ≥ p': Do nothing
  • If x is not stored at the node:
  • All elements < p: Forward signal to w
  • At least one element ≥ p: Insert x and
    send DELETE(x) to w

27
The Tournament Tree: Handling Overflow
  • Let y be element with highest priority py
  • Send signal PUSH(y,py) to appropriate child of v

28
The Tournament Tree: Keeping the Nodes Filled
O(M/B) I/Os to move M/2 elements one level up the
tree
29
The Tournament Tree: Signal Propagation
  • Scan v's signal buffer, partition into sets Xu and Xw
  • Load u into memory, apply signals in Xu to
    u, insert signals into u's signal buffer
  • Do the same for w
  • O((X + M)/B) = O(X/B) I/Os

30
The Tournament Tree: Analysis
  • Elements travel up the tree
  • Cost: O(1/B) I/Os amortized per element and level
  • O((K/B) log2(N/B)) I/Os for K operations
  • Signals travel down the tree
  • Cost: O(1/B) I/Os amortized per signal and level
  • O(K) signals for K operations
  • O((K/B) log2(N/B)) I/Os
  • Theorem: The tournament tree supports INSERT,
    DELETE, DELETEMIN, and DECREASEKEY operations in
    O((1/B) log2(N/B)) I/Os amortized.

31
Single Source Shortest Paths
  • Modified Dijkstra:
  • Retrieve next vertex v from priority queue Q
    using DELETEMIN
  • Retrieve v's adjacency list
  • Update distances of all of v's neighbors, except
    predecessor u on the path from s to v
  • Repeat
  • O(V + (E/B) log2(V/B)) I/Os using tournament tree

32
Single Source Shortest Paths
  • Problem
  • Observation: If v performs a spurious update of
    u, u has tried to update v before.
  • Record this update attempt of u on v by
    inserting u into another priority queue
    Q'. Priority: d(s,u) + w(u,v)

33
Single Source Shortest Paths
  • Second modification
  • Retrieve next vertex using two DELETEMINs, one
    on Q, one on Q'
  • Let (x,px) be the element retrieved from Q, let
    (y,py) be the element retrieved from Q'
  • If px ≤ py: re-insert (y,py) into Q' and proceed
    as normal
  • If py < px: re-insert (x,px) into Q and perform a
    DELETE(y) on Q
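
The two-queue scheme can be sketched in memory as follows. It is an illustration under assumptions: positive edge weights, pairwise distinct distances from the source (as assumed later in the slides), binary heaps instead of tournament trees, and a peek at both queues instead of two DELETEMINs followed by a re-insertion. The names `LazyQueue` and `sssp_two_queues` are hypothetical.

```python
import heapq


class LazyQueue:
    """Priority queue Q with UPDATE and DELETE via lazy deletion."""

    def __init__(self):
        self.key = {}    # x -> (priority, predecessor that caused it)
        self.heap = []

    def update(self, x, p, pred):
        if x not in self.key or p < self.key[x][0]:
            self.key[x] = (p, pred)
            heapq.heappush(self.heap, (p, x))

    def delete(self, x):
        self.key.pop(x, None)

    def peek(self):
        while self.heap:
            p, x = self.heap[0]
            if x in self.key and self.key[x][0] == p:
                return p, x
            heapq.heappop(self.heap)  # stale entry
        return None


def sssp_two_queues(adj, s):
    """Dijkstra without a 'visited' array; adj[v] is a list of (w, weight)."""
    q = LazyQueue()
    q.update(s, 0, None)
    q_prime = []              # Q': heap of (d(s,u) + w(u,v), u)
    dist, order = {}, []
    while True:
        top = q.peek()
        if top is None:
            break
        px, x = top
        if q_prime and q_prime[0][0] < px:
            # py < px: the entry from Q' wins, so remove y from Q.
            _, y = heapq.heappop(q_prime)
            q.delete(y)
            continue
        # px <= py: proceed as normal with x.
        pred = q.key[x][1]
        q.delete(x)
        dist[x] = px
        order.append(x)
        for w, weight in adj[x]:
            if w == pred:
                continue                     # skip predecessor on the path
            q.update(w, px + weight, x)      # may be a spurious update
            heapq.heappush(q_prime, (px + weight, x))  # record the attempt
    return dist, order


sssp_adj = {
    0: [(1, 2), (2, 7)],
    1: [(0, 2), (2, 3), (3, 8)],
    2: [(0, 7), (1, 3), (3, 4), (4, 6)],
    3: [(1, 8), (2, 4), (4, 1)],
    4: [(3, 1), (2, 6)],
}
sssp_dist, sssp_order = sssp_two_queues(sssp_adj, 0)
```

In this example vertex 2 re-inserts vertex 0 into Q with priority 12, and the entry that vertex 0 left in Q' with priority 7 removes it again before it can be retrieved.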

34
Single Source Shortest Paths
  • Lemma: A spurious update is removed from Q before
    the targeted vertex can be retrieved using
    DELETEMIN.
  • Event A: Spurious update happens (time d(s,v))
  • Event B: Vertex u is deleted by retrieval of u
    from Q' (time d(s,u) + w(e))
  • Event C: Vertex u is retrieved from Q using
    DELETEMIN operation (time d(s,v) + w(e))

35
Single Source Shortest Paths
  • Assume that all vertices have different distances
    from source s
  • d(u) < d(v)
  • d(v) ≤ d(u) + w(e) < d(v) + w(e)
  • Sequence of events: A → B → C
  • Theorem: The single source shortest path problem
    on an undirected graph G = (V,E) can be solved
    in O(V + (E/B) log2(V/B)) I/Os.

36
Planar Graphs
  • Shortest paths in planar graphs
  • Planar separators
  • Planar DFS

37
Shortest Paths in Planar Graphs
38
Shortest Paths in Planar Graphs
  • Observation: For every separator vertex v, the
    distances from s to v in G and GR are the same.
  • The distances from s to all separator vertices
    can be computed in GR.

39
Shortest Paths in Planar Graphs
  • Observation: For every vertex v in Gi, dist(s,v)
    = min{dist(s,x) + dist(x,v) : x ∈ ∂Gi}.
  • Can compute dist(s,v) in the following graph

[Figure: graph containing s and v]
40
Shortest Paths in Planar Graphs
  • Three main steps
  • Solve all-pairs shortest paths in subgraphs Gi
  • Compute shortest paths from s to separator
    vertices in GR
  • Compute shortest paths from s to all remaining
    vertices

41
Shortest Paths in Planar Graphs
  • Regular h-partition:
  • O(N/h) subgraphs G1,...,Gr
  • Each Gi has size at most h
  • Each Gi has boundary size at most O(√h)
  • Total number of separator vertices: O(N/√h)
  • Number of boundary sets is O(N/h)

42
Shortest Paths in Planar Graphs
  • Three main steps
  • Solve all-pairs shortest paths in subgraphs Gi
  • Compute shortest paths from s to separator
    vertices in GR
  • Compute shortest paths from s to all remaining
    vertices
  • Assume the given partition is a regular
    B²-partition
  • Steps 1 and 3 take O(scan(N)) I/Os
  • Graph GR has O(N/B) vertices and O(N) edges

43
Shortest Paths in Planar Graphs
  • Data structures
  • List L storing tentative distances of all
    vertices
  • Priority queue Q storing vertices with their
    tentative distances as priorities
  • One step
  • Retrieve next vertex v using DELETEMIN
  • Get distances of v's neighbors from L
  • Update their distances in Q using DELETE and
    INSERT
  • O(N + sort(N)) I/Os

44
Shortest Paths in Planar Graphs
  • One I/O per boundary set
  • Each boundary set is touched O(B) times
  • Once per vertex on the boundary of the region
  • O(N/B²) boundary sets ⇒ O(N/B) I/Os

45
Planar Separator
  • Goal: Compute a separator S of size O(N/√h)
    whose removal partitions G into subgraphs of size
    at most h.
  • Basic idea:
  • Compute hierarchy of log(DB) graphs of
    geometrically decreasing size using graph
    contraction
  • Compute a separator of the smallest graph
  • Undo the contractions and maintain the separator
    while doing this
  • Assumption: M = Ω(h log2 B)

46
Planar Separator
47
Planar Separator
  • Properties
  • All Gi are planar
  • |Gi+1| ≤ |Gi|/2
  • Every vertex in Gi+1 represents only a constant
    number of vertices in Gi
  • Every vertex in Gi+1 represents at most 2^(i+2)
    vertices in G0
  • r = log2(DB) graphs G0,...,Gr
  • |Gr| = O(N/(DB))

48
Planar Separator
49
Planar Separator
  • Compute separator Sr of Gr
  • Sr = S''r partitions Gr into connected components
    of size at most h·log2(DB)
  • Takes O(|Gr|) = O(N/B) I/Os [AD96]

50
Planar Separator
  • Compute Si from Si+1:
  • Let S'i be the set of vertices in Gi represented
    by the vertices in Si+1
  • Connected components of Gi − S'i have size at
    most c·h·log2(DB)
  • Partition every connected component of size more
    than h·log2(DB) into components of size
    h·log2(DB) ⇒ separator S''i
  • Takes O(sort(|Gi|)) I/Os
  • Connected components: O(sort(|Gi|))
  • Partitioning happens in internal memory
  • Total: O(sort(N)) I/Os

51
Planar Separator
  • Separator S0 partitions G0 into connected
    components of size at most h·log2(DB)
  • Size of S0

52
Planar Separator
  • Compute a superset S of S0 so that no connected
    component of G − S has size more than h
  • Partition every connected component of G − S0
    separately in internal memory
  • Total number of extra separator vertices is
  • Extra cost: O(sort(N)) I/Os

53
Building the Graph Hierarchy
  • Properties
  • All Gi are planar
  • |Gi+1| ≤ |Gi|/2
  • Every vertex in Gi+1 represents only a constant
    number of vertices in Gi
  • Every vertex in Gi+1 represents at most 2^(i+2)
    vertices in G0
  • Build Gi+1 from Gi by
  • Contracting edges
  • Merging vertices of degree 2 with the same
    neighbors

54
Building the Graph Hierarchy
  • Iterative approach
  • Extract set of edges that can be contracted
  • Contract subset of these edges to reduce number
    of vertices by a factor of two
  • Repeat until no contractible edges remain
  • Problem
  • Standard graph contraction procedure may contract
    too many vertices into a single vertex.

55
Building the Graph Hierarchy
  • Solution
  • Compute maximal matching of contractible subgraph
  • Contract edges in the matching
  • New problem
  • We may not contract sufficient number of edges to
    reduce number of vertices by a constant factor
  • Two-stage contraction
  • Contract maximal matching
  • Contract edges between matched and unmatched
    vertices
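
The first stage (contracting a maximal matching) can be sketched in memory as below. This is only an illustration: the matching is computed greedily in edge order, the whole edge list is treated as the contractible subgraph, and the second stage (edges between matched and unmatched vertices) is not shown. The function name `contract_maximal_matching` is hypothetical.

```python
def contract_maximal_matching(edges):
    """Greedy maximal matching followed by contraction of matched edges.

    edges is a list of undirected edges (u, v) with comparable vertex ids.
    Returns (matched, contracted), where matched maps each matched vertex
    to its partner and contracted is the sorted edge list after contraction.
    """
    matched = {}
    for u, v in edges:
        # Take an edge whenever both endpoints are still free.
        if u != v and u not in matched and v not in matched:
            matched[u] = v
            matched[v] = u

    def rep(x):
        # Representative of the contracted vertex containing x.
        return min(x, matched.get(x, x))

    contracted = sorted(
        {(min(rep(u), rep(v)), max(rep(u), rep(v)))
         for u, v in edges if rep(u) != rep(v)}
    )
    return matched, contracted


path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
path_matching, path_contracted = contract_maximal_matching(path_edges)
```

Because every contracted vertex comes from a single matching edge, it represents at most two original vertices, which addresses the problem of contracting too many vertices into one.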

56
Building the Graph Hierarchy
  • Why is this two-stage approach good?
  • No unmatched vertex remains in contractible
    subgraph
  • Every matched vertex represents at least two
    vertices before the contraction
  • Size of graph reduces by a factor of two
  • If a single iteration takes O(sort(|Gi|)) I/Os,
    the whole construction of Gi+1 from Gi
    takes O(sort(|Gi|)) I/Os

57
A Single Contraction Phase
  • Maximal matching can be computed and contracted
    in O(sort(H)) I/Os, where H is the current
    contractible subgraph
  • Bipartite contraction
  • Takes O(sort(H)) I/Os using buffer tree as
    priority queue

58
Building the Graph Hierarchy
  • Lemma: Graph Gi+1 can be constructed from Gi in
    O(sort(|Gi|)) I/Os.
  • Corollary: The whole graph hierarchy can be built
    in O(sort(|G0|)) = O(sort(N)) I/Os.

59
Planar DFS
60
Planar DFS
61
Planar DFS
  • Observation: Every cycle in the i-th layer is a
    boundary cycle of graph Gi.
  • Every bicomp of a layer is a cycle.

62
DFS in a Layer
63
Planar DFS
  • DFS in a single layer Hi takes O(sort(|Hi|))
    I/Os
  • Compute the bicomps
  • Root the bicomp tree
  • Remove one of the edges incident to parent
    cutpoint in each cycle
  • Total I/O-complexity: O(sort(N))

64
Planar DFS
65
Planar DFS
66
Building the Face-on-Vertex Graph
67
Lower Bounds and Open Problems
  • Lower bounds
  • List ranking, BFS, DFS, and shortest paths
  • Connected and biconnected components
  • Open problems

68
Lower Bounds: Split Proximate Neighbors
[Figure: example instance on the elements 1 to 8]
69
Lower Bounds: Split Proximate Neighbors
  • Lemma: Split proximate neighbors requires
    Ω(perm(N)) I/Os.
  • Total: O(I(N) + scan(N)) = O(I(N))
  • I(N) = Ω(perm(N))
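
For orientation, the bounds named here can be evaluated numerically. The transcript does not define them; the sketch below assumes the standard I/O-model definitions scan(N) = N/B, sort(N) = (N/B) log_{M/B}(N/B), and perm(N) = min(N, sort(N)), with constants ignored. The function names are illustrative.

```python
import math


def scan_bound(n, B):
    """scan(N): number of blocks needed to read N items."""
    return math.ceil(n / B)


def sort_bound(n, M, B):
    """sort(N) = (N/B) * log_{M/B}(N/B), with the log clamped to at least 1."""
    blocks = n / B
    if blocks <= 1:
        return 1.0
    return blocks * max(1.0, math.log(blocks, M / B))


def perm_bound(n, M, B):
    """perm(N) = min(N, sort(N))."""
    return min(n, sort_bound(n, M, B))


# Example: N = 10^6 items, memory M = 10^4 items, block size B = 100 items.
example_sort = sort_bound(10**6, 10**4, 100)
example_perm = perm_bound(10**6, 10**4, 100)
```

For realistic block sizes sort(N) is far below N, so perm(N) and sort(N) coincide; perm(N) = N only matters when B is very small.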

70
Lower Bounds: List Ranking
  • Consider general algorithms for weighted list
    ranking
  • Algorithm is only allowed to use associativity of
    sum operator
  • Algorithm can be made to have the following
    property:
  • For every vertex v, v and succ(v) are both in
    main memory at some point during the course of
    the algorithm
  • Note: The lower bound we show does not hold for
    unweighted list ranking or weighted list ranking
    over groups.

71
Lower Bounds: List Ranking
  • When both copies of x are in main memory, move to
    buffer of size B
  • When buffer full, flush to disk
  • Split proximate neighbors could be solved
    in O(I(N) + scan(N)) I/Os
  • I(N) = Ω(perm(N))

72
Lower Bounds: List Ranking, BFS, DFS, and Shortest
Paths
  • Theorem: List ranking requires Ω(perm(N)) I/Os.
  • List ranking can be solved using BFS, DFS, or
    SSSP from the head of the list.
  • Theorem: BFS, DFS, and SSSP require Ω(perm(N))
    I/Os.
  • Note: Again, lower bound holds only for
    algorithms that compute distances from source
    only by adding path lengths.

73
Lower Bounds: Segmented Duplicate Elimination
  • Let P ≤ N ≤ P²
  • Elements drawn from interval [2P+1, 3P]
  • Construct Boolean array C[2P+1..3P] s.t. C[i] = 1
    iff i ∈ S
  • Proposition: Segmented duplicate elimination
    requires Ω(perm(N)) I/Os.
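
The output of the problem can be illustrated with a direct in-memory computation. The sketch only shows what the array C contains; it says nothing about the segmented input layout or the I/O lower bound, and the function name `membership_array` is hypothetical.

```python
def membership_array(S, P):
    """Return C as a list where C[i - (2P+1)] = 1 iff i is in S.

    S is a sequence of elements drawn from the interval [2P+1, 3P].
    """
    lo, hi = 2 * P + 1, 3 * P
    C = [0] * (hi - lo + 1)
    for x in S:
        if not lo <= x <= hi:
            raise ValueError(f"{x} is outside [{lo}, {hi}]")
        C[x - lo] = 1   # duplicates simply set the same entry again
    return C


# With P = 8 the interval is [17, 24].
example_C = membership_array([19, 19, 20, 20, 22, 20, 18, 23, 17, 19], 8)
```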

[Figure: example with S = {17, 18, 19, 20, 22, 23}, input split into segments of size P/2]
74
Lower Bounds: Connected Components
[Figure: graph built from the example, with segments S1, ..., S4 and values 17, ..., 24]
  • Graph construction: O(scan(N)) I/Os
  • V = Θ(P), E = N

75
Lower Bounds: Connected and Biconnected Components
  • Theorem: Computing the connected components of a
    graph G = (V,E) requires Ω(perm(E)) I/Os.
  • Theorem: Computing the biconnected components of
    a graph G = (V,E) requires Ω(perm(E)) I/Os.
76
More Classes of Sparse Graphs
  • Grid graphs
  • Separators: Size [bound not preserved] in O(sort(N))
    I/Os
  • BFS/SSSP: O(sort(N))
  • DFS
  • Graphs of bounded treewidth
  • Separators: O(N/h) in O(sort(N)) I/Os
  • BFS/SSSP: O(sort(N))
  • DFS: ???

77
Open Problems
  • Optimal separators for grid graphs
  • DFS
  • Grid graphs
  • Graphs of bounded treewidth
  • Semi-external shortest paths
  • Optimal connectivity
  • Optimal BFS, DFS, and shortest paths or lower
    bounds
  • Directed graphs
  • Topological sorting
  • Strongly connected components