IO Efficient Minimum Spanning Tree Algorithm

1
IO Efficient Minimum Spanning Tree Algorithm
  • Dawei Chen
  • 20080415

2
Outline
  • What is a Minimum Spanning Tree (MST)?
  • MST algorithms in internal memory
  • Connectivity problems solved in an IO-efficient manner
    (background for MST in external memory)
  • IO-efficient MST algorithm

4
Minimum Spanning Tree (MST)
  • Spanning Tree (ST): Given a connected, undirected
    graph, a spanning tree is a subgraph that is a
    tree and contains all of the graph's vertices.
  • Minimum Spanning Tree (MST): Given a connected,
    undirected graph with weights assigned to all its
    edges, an MST is an ST whose total edge weight is
    no larger than that of any other ST.

5
Minimum Spanning Tree (MST)
7
MST algorithms in internal memory
  • Prim's and Kruskal's algorithms
  • Only Kruskal's algorithm is introduced here

It works as follows:
  1. Create a forest F (a set of trees), where each vertex in the
     graph is a separate tree.
  2. Create a set S containing all the edges in the graph.
  3. While S is nonempty:
     a. Remove an edge with minimum weight from S.
     b. If that edge connects two different trees, add it to the
        forest, combining the two trees into a single tree.
     c. Otherwise, discard that edge.
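
The steps above can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name kruskal, the (weight, u, v) edge format, and the use of a union-find array to track which tree a vertex belongs to are all assumptions made for the example.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning forest.

    edges is a list of (weight, u, v) tuples over vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))  # step 1: every vertex is its own tree

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen, total = [], 0
    for weight, u, v in sorted(edges):  # steps 2 and 3a: edges by increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # step 3b: the edge joins two different trees
            parent[root_u] = root_v
            chosen.append((u, v, weight))
            total += weight
        # step 3c: otherwise the edge would close a cycle and is discarded
    return total, chosen
```

Sorting the edge list once replaces the repeated "remove an edge with minimum weight" step, which is where the O(E log E) cost comes from.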
8
MST algorithms in internal memory
This is our original graph. The numbers near the
arcs indicate their weight. None of the arcs are
highlighted.
9
MST algorithms in internal memory
AD and CE are the shortest arcs, with length 5,
and AD has been arbitrarily chosen, so it is
highlighted.
10
MST algorithms in internal memory
CE is now the shortest arc that does not form a
cycle, with length 5, so it is highlighted as the
second arc.
11
MST algorithms in internal memory
The next arc, DF with length 6, is highlighted
using much the same method.
12
MST algorithms in internal memory
The next-shortest arcs are AB and BE, both with
length 7. AB is chosen arbitrarily, and is
highlighted. The arc BD has been highlighted in
red, because there already exists a path (in
green) between B and D, so it would form a cycle
(ABD) if it were chosen.
13
MST algorithms in internal memory
The process continues to highlight the
next-smallest arc, BE with length 7. Many more
arcs are highlighted in red at this stage: BC
because it would form the loop BCE, DE because it
would form the loop DEBA, and FE because it would
form FEBAD.
14
MST algorithms in internal memory
Finally, the process finishes with the arc EG of
length 9, and the minimum spanning tree is found.
15
MST algorithms in internal memory
  • Complexity of Kruskal's algorithm: O(E log E)
  • Proof: not presented here.

17
Connectivity Problem
  • The connectivity problem is to compute the number of
    connected parts (components) of a given graph.

18
Connectivity Problem
  • The connectivity problem is equivalent to the
    labeling problem: if two vertices are in the same
    component, give them the same label.

19
SemiExternalConnectivity
  • The SemiExternalConnectivity algorithm assumes that all
    vertices can be loaded into main memory, i.e.
    V < M.

Let L(x) denote the label of x; L(x) is a number.
For each edge (x, y), if L(x) != L(y):
  For every vertex m, if L(m) = L(x) or L(m) = L(y),
  then let L(m) = min{L(x), L(y)}.
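
A minimal Python sketch of this labeling loop is shown below. The slide does not say how labels are initialized; the sketch assumes L(x) = x at the start. The function name and the edge format are also illustrative assumptions. The label array plays the role of main memory, and the edge list is only scanned once, as it would be when streamed from disk.

```python
def semi_external_connectivity(num_vertices, edges):
    """Label vertices so that two vertices share a label iff they are connected.

    Assumption: labels start as L(x) = x (not stated on the slide).
    edges is any iterable of (x, y) pairs; it is scanned exactly once.
    """
    label = list(range(num_vertices))
    for x, y in edges:
        lx, ly = label[x], label[y]
        if lx != ly:
            low = min(lx, ly)
            # Relabel every vertex m whose label is L(x) or L(y).
            for m in range(num_vertices):
                if label[m] == lx or label[m] == ly:
                    label[m] = low
    return label
```

The number of distinct values in the returned list is the number of connected components.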
20
SemiExternalConnectivity
  • Black solid lines: edges that have been checked.
  • Dashed lines: edges that have not been checked yet.
  • Red solid line: the edge currently being checked.

21
SemiExternalConnectivity
  • Correctness of SemiExternalConnectivity: it is
    obvious.
  • Complexity: O(VE), since each of the E edges may
    trigger a scan over all V labels.

22
FullExternalConnectivity
  • FullExternalConnectivity algorithm for graph G

If V < M, apply SemiExternalConnectivity.
Else:
  Let wv denote the smallest neighbor of vertex w.
  Create a subgraph H of G consisting of the edges (w, wv),
  for all w in G.
  Compress each connected part of H into a single vertex.
  Recursively run the FullExternalConnectivity algorithm on the
  compressed graph.
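
The recursion can be illustrated with the following Python sketch. It only simulates the control flow in memory and makes several assumptions that are not on the slide: the parameter memory_limit stands for M, vertices are comparable integers, the helper label_components stands in for SemiExternalConnectivity, and the compression of H is also done with that helper for brevity rather than with an I/O-efficient procedure.

```python
def label_components(vertices, edges):
    """Base case: min-label propagation, as in SemiExternalConnectivity."""
    label = {v: v for v in vertices}
    for x, y in edges:
        lx, ly = label[x], label[y]
        if lx != ly:
            low = min(lx, ly)
            for m in label:
                if label[m] in (lx, ly):
                    label[m] = low
    return label


def full_external_connectivity(vertices, edges, memory_limit):
    """Return {vertex: component label} via smallest-neighbour contraction."""
    vertices = list(vertices)
    edges = [(x, y) for x, y in edges if x != y]
    if len(vertices) < memory_limit or not edges:
        return label_components(vertices, edges)

    # Smallest neighbour of every vertex that has at least one neighbour.
    smallest = {}
    for x, y in edges:
        smallest[x] = min(smallest.get(x, y), y)
        smallest[y] = min(smallest.get(y, x), x)

    # H consists of the edges (w, smallest[w]); compress each part of H.
    h_edges = list(smallest.items())
    super_of = label_components(vertices, h_edges)

    # Compressed graph: one vertex per part of H, edges between distinct parts.
    new_vertices = set(super_of.values())
    new_edges = {(super_of[x], super_of[y]) for x, y in edges
                 if super_of[x] != super_of[y]}
    sub = full_external_connectivity(new_vertices, new_edges, memory_limit)
    return {v: sub[super_of[v]] for v in vertices}
```

Every vertex with at least one neighbor is merged with another vertex in each round, so the vertex count shrinks at every level of the recursion until the base case applies.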
23
FullExternalConnectivity
  • The compress procedure
  • Lines: edges in G
  • Red lines: edges in H (also in G)

24
FullExternalConnectivity
  • Correctness: obvious
  • Complexity:

26
IO Efficient MST algorithm
  • SemiExternal case: similar to the connectivity
    problem.
  • FullExternal case: also similar to the connectivity
    problem.

27
SemiExternal case
  • All the vertices can be loaded into memory, i.e.
    V < M.
  • Similar to the connectivity problem.

28
SemiExternal case
  • The modification: each time, we check only the
    unchecked edge with the smallest weight.
  • S is the resulting edge set.

Let L(x) denote the label of x; L(x) is a number.
For the unchecked edge (x, y) with the smallest weight,
if L(x) != L(y):
  For every vertex m, if L(m) = L(x) or L(m) = L(y),
  then let L(m) = min{L(x), L(y)}.
  Add (x, y) to S.
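
As a rough Python sketch of this procedure: the edges are sorted by weight so that each iteration sees the smallest unchecked edge, and the label array is kept in memory. The initial labeling L(x) = x, the function name, and the (weight, x, y) edge format are assumptions made for the example.

```python
def semi_external_mst(num_vertices, weighted_edges):
    """Return the result edge set S as a list of (x, y, weight) tuples.

    weighted_edges is a list of (weight, x, y) tuples.
    Assumption: labels start as L(x) = x (not stated on the slide).
    """
    label = list(range(num_vertices))
    S = []
    for weight, x, y in sorted(weighted_edges):  # smallest unchecked edge first
        lx, ly = label[x], label[y]
        if lx != ly:
            low = min(lx, ly)
            # Merge the two label classes, then keep the edge.
            for m in range(num_vertices):
                if label[m] == lx or label[m] == ly:
                    label[m] = low
            S.append((x, y, weight))
    return S
```

Only the sort touches the full edge set, which matches the intuition that the cost is dominated by sorting the edges.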
29
SemiExternal case
  • Correctness: obvious, because it is actually
    Kruskal's algorithm.
  • Complexity: O(sort(E))

30
FullExternal case
  • Similar to the connectivity problem.
  • The modification: during construction of the subgraph
    H of G, we use the edge (w, wv) where (w, wv) is the
    lowest-weight edge of w, rather than taking wv to
    be w's smallest neighbor.
  • The final H is the resulting MST.
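
Choosing each vertex's lowest-weight edge and compressing the connected parts is essentially Borůvka's algorithm. The sketch below runs those rounds iteratively and entirely in memory, so it omits the switch to the SemiExternal case and all I/O concerns. It assumes distinct edge weights (as the proof later does); the function name and the (weight, x, y) edge format are illustrative choices.

```python
def full_external_mst(vertices, weighted_edges):
    """Return the MST (or spanning forest) edges as a set of (weight, x, y).

    Assumes distinct edge weights. comp[v] is the compressed vertex that
    currently contains v.
    """
    comp = {v: v for v in vertices}
    mst = set()
    while True:
        # Lowest-weight edge leaving each compressed vertex.
        lightest = {}
        for edge in weighted_edges:
            _, x, y = edge
            cx, cy = comp[x], comp[y]
            if cx == cy:
                continue  # edge lies inside an already compressed part
            for c in (cx, cy):
                if c not in lightest or edge < lightest[c]:
                    lightest[c] = edge
        if not lightest:
            return mst
        # Add the chosen edges to H and compress the parts they connect.
        for edge in lightest.values():
            _, x, y = edge
            cx, cy = comp[x], comp[y]
            if cx != cy:
                mst.add(edge)
                low = min(cx, cy)
                for m in comp:
                    if comp[m] in (cx, cy):
                        comp[m] = low
```

Each round at least halves the number of compressed vertices that still have outgoing edges, so the loop runs O(log V) times.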

31
FullExternal case
  • The construction of H
  • Lines: edges in G
  • Red lines: edges in H (also in G)

32
FullExternal case
  • Correctness: proved later.
  • Complexity:

There are optimizations, but they are not presented here.
33
Proof of FullExternal
  • First of all, we need to show that, with this
    construction method, the resulting subgraph H is a
    tree.
  • We will prove this by showing that there are
    no cycles in H.
  • To simplify the problem, we assume that no two
    edge weights are equal.

34
Proof of FullExternal
  • Suppose there is a cycle in H with edges a1, a2, ..., an.
  • Obviously, a1 is the lowest-weight edge of either
    vertex 1 or vertex 2. The same holds for
    a2, a3, ..., an.
  • Assume a1 < a2. Then a2 is not vertex 2's lowest-weight
    edge, so it must be vertex 3's. So a2 < a3.
  • Since a2 < a3, in the same way we get a3 < a4.
  • So we have a1 < a2 < a3 < ... < an < a1, which is impossible.
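
The claim can be checked on small inputs with a short Python sketch. It builds H by taking each vertex's lowest-weight edge and then tests H for cycles with union-find. Both function names and the (weight, x, y) edge format are assumptions made for this illustration; the check is a sanity test on examples, not a substitute for the proof.

```python
def lightest_edge_subgraph(weighted_edges):
    """H: for every vertex, its lowest-weight incident edge (as a set)."""
    best = {}
    for edge in weighted_edges:
        w, x, y = edge
        for v in (x, y):
            if v not in best or w < best[v][0]:
                best[v] = edge
    return set(best.values())


def has_cycle(edges):
    """Union-find cycle test on undirected (weight, x, y) edges."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for _, x, y in edges:
        rx, ry = find(x), find(y)
        if rx == ry:
            return True  # x and y were already connected: this edge closes a cycle
        parent[rx] = ry
    return False
```

For any edge list with distinct weights, has_cycle(lightest_edge_subgraph(edges)) should be False, in line with the argument above.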

35
Proof of FullExternal
  • Further, for a tree T that is an MST of G, we
    prove
  • If not, there must be an edge (v, wv) that is in H
    but not in T. Further, since H and T are both STs,
    they have the same number of edges. So there must
    be an edge (x, y) that is in T but not in H.

36
Proof of FullExternal
  • Edge (x, y) is not selected at random. W.l.o.g., we
    assume that the path from y to v in T is the
    same as in H. (Note that y and v may be the same
    vertex.)

37
Proof of FullExternal
  • Assume the path from v to y is (a1, a2, ..., an),
    where a1 = v and an = y. Because (v, wv) is the
    lowest-weight edge of v, we have (v, wv) < (a1, a2),
    where v = a1. Similarly, we have
    (a1, a2) < (a2, a3) < ... < (an-1, an) < (x, y).
    So we get (v, wv) < (x, y).

38
Proof of FullExternal
  • Then, replacing (x, y) by (v, wv) in T, we get
    another ST (*) whose total edge weight is smaller
    than that of T, which contradicts the fact that T
    is an MST.
  • So we have
  • Further, by the definition of MST, H is
    an MST of G.

(*) Please note that a connected graph G is a tree
if and only if V = E + 1. So since E does not
change, a tree will remain a tree, provided that
it is still connected.
39
Reference
  • Norbert Zeh, I/O-Efficient Graph Algorithms,
    section 5.4

40
Q&A