Today's Material

Provided by: cuneyta

1
Today's Material
  • The dynamic equivalence problem
  • a.k.a. Disjoint Sets/Union-Find ADT
  • Covered in Chapter 8 of the textbook

2
Motivation
  • Consider the relation = between integers
  • For any integer A, A = A (reflexive)
  • For integers A and B, A = B means that B = A
    (symmetric)
  • For integers A, B, and C, A = B and B = C means
    that A = C (transitive)
  • Consider cities connected by two-way roads
  • A is trivially connected to itself
  • A is connected to B means B is connected to A
  • If A is connected to B and B is connected to C,
    then A is connected to C

3
Equivalence Relationships
  • An equivalence relation R obeys three properties:
  • (1) reflexive: for any x, xRx is true
  • (2) symmetric: for any x and y, xRy implies yRx
  • (3) transitive: for any x, y, and z, xRy and yRz
    implies xRz
  • Preceding relations are all examples of
    equivalence relations
  • What are not equivalence relations?

4
Equivalence Relationships
  • An equivalence relation R obeys three properties:
  • (1) reflexive: for any x, xRx is true
  • (2) symmetric: for any x and y, xRy implies yRx
  • (3) transitive: for any x, y, and z, xRy and yRz
    implies xRz
  • What about < on integers?
  • Properties 1 and 2 are violated
  • What about ≤ on integers?
  • Property 2 is violated
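The three properties can be spot-checked mechanically on a small range of integers. The sketch below is only an illustration: the function names and the idea of testing on a finite sample range are assumptions made here, and passing on a sample does not prove a property for all integers.

```c
#include <assert.h>
#include <stdbool.h>

/* A relation on ints given as a predicate: rel(x, y) is true when xRy. */
typedef bool (*relation)(int x, int y);

bool rel_eq(int x, int y) { return x == y; }
bool rel_lt(int x, int y) { return x <  y; }
bool rel_le(int x, int y) { return x <= y; }

/* Each check tests one property on the sample range lo..hi only. */
bool is_reflexive(relation r, int lo, int hi) {
  assert(lo <= hi);
  for (int x = lo; x <= hi; x++)
    if (!r(x, x)) return false;
  return true;
}

bool is_symmetric(relation r, int lo, int hi) {
  assert(lo <= hi);
  for (int x = lo; x <= hi; x++)
    for (int y = lo; y <= hi; y++)
      if (r(x, y) && !r(y, x)) return false;
  return true;
}

bool is_transitive(relation r, int lo, int hi) {
  assert(lo <= hi);
  for (int x = lo; x <= hi; x++)
    for (int y = lo; y <= hi; y++)
      for (int z = lo; z <= hi; z++)
        if (r(x, y) && r(y, z) && !r(x, z)) return false;
  return true;
}
```

On the range -5..5, = passes all three checks, < fails the reflexive and symmetric checks, and ≤ fails only the symmetric check.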

5
Equivalence Classes and Disjoint Sets
  • Any equivalence relation R divides all the
    elements into disjoint sets of equivalent items
  • Let ~ be an equivalence relation. If A ~ B, then A
    and B are in the same equivalence class.
  • Examples
  • On a computer chip, if ~ denotes electrically
    connected, then sets of connected components
    form equivalence classes
  • On a map, cities that have two-way roads between
    them form equivalence classes
  • What are the equivalence classes for the relation
    Modulo N applied to all integers?

6
Equivalence Classes and Disjoint Sets
  • Let ~ be an equivalence relation. If A ~ B, then A
    and B are in the same equivalence class.
  • Examples
  • The relation Modulo N divides all integers into N
    equivalence classes (for the remainders 0, 1, ...,
    N-1)
  • Under Mod 5
  • 0 ~ 5 ~ 10 ~ 15 ...
  • 1 ~ 6 ~ 11 ~ 16 ...
  • 2 ~ 7 ~ 12 ...
  • 3 ~ 8 ~ 13 ...
  • 4 ~ 9 ~ 14 ...
  • (5 equivalence classes denoting remainders 0
    through 4 when divided by 5)
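As a small illustration of the Mod N example, the class of an integer can be named by its remainder. The helper names `mod_class` and `same_class` are made up for this sketch; the adjustment for negative inputs is needed because C's `%` can return a negative remainder.

```c
#include <assert.h>

/* Name of the equivalence class of a under "mod n": the remainder 0..n-1. */
int mod_class(int a, int n) {
  assert(n > 0);
  int r = a % n;             /* may be negative when a is negative */
  return r < 0 ? r + n : r;
}

/* a ~ b exactly when both have the same remainder. */
int same_class(int a, int b, int n) {
  return mod_class(a, n) == mod_class(b, n);
}
```

Under mod 5, 13 and 18 both map to class 3, so `same_class(18, 13, 5)` is true.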

7
Union and Find Problem Definition
  • Given a set of elements and some equivalence
    relation ~ between them, we want to figure out
    the equivalence classes
  • Given an element, we want to find the equivalence
    class it belongs to
  • E.g. Under mod 5, 13 belongs to the equivalence
    class of 3
  • E.g. For the map example, want to find the
    equivalence class of Eskisehir (all the cities it
    is connected to)
  • Given a new element, we want to add it to an
    equivalence class (union)
  • E.g. Under mod 5, since 18 ~ 13, perform a union
    of 18 with the equivalence class of 13
  • E.g. For the map example, Ankara is connected to
    Eskisehir, so add Ankara to equivalence class of
    Eskisehir

8
Disjoint Set ADT
  • Stores N unique elements
  • Two operations
  • Find: Given an element, return the name of its
    equivalence class
  • Union: Given the names of two equivalence
    classes, merge them into one class (which may
    have a new name or one of the two old names)
  • ADT divides elements into E equivalence classes,
    1 ≤ E ≤ N
  • Names of classes are arbitrary
  • E.g. 1 through N, as long as Find returns the
    same name for 2 elements in the same equivalence
    class

9
Disjoint Set ADT Properties
  • Disjoint set equivalence property: every element
    of a DS ADT belongs to exactly one set (its
    equivalence class)
  • Dynamic equivalence property: the set of an
    element can change after execution of a union
Disjoint Set ADT
  • Example
  • Initial Classes: {1,4,8}, {2,3}, {6}, {7}, {5,9,10}
  • Name of equiv. class underlined in the figure
  • Find(4) returns 8, the name of the class {1,4,8}
  • Union(6, 2) merges {6} and {2,3} into {2,3,6}

10
Disjoint Set ADT Formal Definition
  • Given a set U = {a1, a2, ..., an}
  • Maintain a partition of U, a set of subsets (or
    equivalence classes) of U denoted by S1, S2, ...,
    Sk such that
  • each pair of distinct subsets Si and Sj is disjoint
  • together, the subsets cover U
  • each subset has a unique name
  • Union(a, b) creates a new subset which is the
    union of a's subset and b's subset
  • Find(a) returns the unique name for a's subset

11
Implementation Ideas and Tradeoffs
  • How about an array implementation?
  • N element array A: A[i] holds the class name for
    element i
  • E.g. Assume 8 ~ 4 ~ 3
  • pick 3 as class name and set A[8] = A[4] = A[3] = 3
[Figure: example partition of the elements 0 through 9 into sets, one of which is {3, 4, 8}]
  • Running time for Find(i)? O(1) (just return A[i])
  • Running time for Union(i, j)? O(N) (must scan A
    and relabel every element of one class)
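A minimal sketch of the array idea follows. The names `qf_init`, `qf_find`, and `qf_union` and the element count of 10 are assumptions for illustration; only the array A and its meaning come from the slide.

```c
#include <assert.h>

#define NELEMS 10     /* assumed element count for this sketch */
int A[NELEMS];        /* A[i] = class name of element i */

/* Start with every element in a class of its own. */
void qf_init(void) {
  for (int i = 0; i < NELEMS; i++) A[i] = i;
}

/* O(1): the class name is stored directly. */
int qf_find(int i) {
  assert(i >= 0 && i < NELEMS);
  return A[i];
}

/* O(N): merges the classes named a and b by relabeling b's members. */
void qf_union(int a, int b) {
  if (a == b) return;
  for (int i = 0; i < NELEMS; i++)
    if (A[i] == b) A[i] = a;
}
```

Calling `qf_union(3, qf_find(4))` and then `qf_union(3, qf_find(8))` reproduces the slide's example: A[8], A[4], and A[3] all hold 3.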
12
Implementation Ideas and Tradeoffs
  • How about linked lists?
  • One linked list for each equivalence class
  • Class name = head of list
[Figure: the example sets over the elements 0 through 9 drawn as linked lists, one list per set]
  • Running time for Union(i, j)?
  • E.g. Union(1, 3)
  • O(1): Simply append one list to the end of the
    other
  • Running time for Find(i)?
  • O(N): Must scan all lists in the worst case
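The linked-list idea can be sketched with index arrays instead of pointers. Everything named here (`next_in_list`, `tail_of`, `is_head`, the `ll_` functions) is hypothetical, and the O(1) union assumes each head also records the tail of its list, which the slide does not spell out.

```c
#include <assert.h>

#define NELEMS 10            /* assumed element count for this sketch */
int next_in_list[NELEMS];    /* successor in the list, or -1 at the end */
int tail_of[NELEMS];         /* for heads only: last node of the list */
int is_head[NELEMS];         /* 1 if element i heads a list */

void ll_init(void) {
  for (int i = 0; i < NELEMS; i++) {
    next_in_list[i] = -1;
    tail_of[i] = i;
    is_head[i] = 1;
  }
}

/* O(1): appends the list headed by b to the list headed by a. */
void ll_union(int a, int b) {
  assert(is_head[a] && is_head[b]);
  if (a == b) return;
  next_in_list[tail_of[a]] = b;
  tail_of[a] = tail_of[b];
  is_head[b] = 0;
}

/* O(N) worst case: walks the lists until x is found, returns its head. */
int ll_find(int x) {
  for (int h = 0; h < NELEMS; h++) {
    if (!is_head[h]) continue;
    for (int p = h; p != -1; p = next_in_list[p])
      if (p == x) return h;
  }
  return -1;  /* not reached for a valid element */
}
```

The trade-off is the mirror image of the array version: union is constant time, but a find may have to walk every list.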
13
Implementation Ideas and Tradeoffs
  • Tradeoff between Union and Find: can we do both in
    O(1) time?
  • N-1 Unions (the maximum possible) and M Finds
  • O(N^2 + M) for array
  • O(N + MN) for linked list implementation
  • Can we do this in O(M + N) time?

14
Towards a new Data Structure
  • Intuition: Finding the representative member (=
    class name) for an element is like the opposite
    of searching for a key in a given set
  • So, instead of trees with pointers from each node
    to its children, let's use trees with a pointer
    from each node to its parent
  • Such trees are known as Up-Trees

15
Up-Tree Data Structure
  • Each equivalence class (or disjoint set) is an
    up-tree with its root as its representative
    member
  • All members of a given set are nodes in that
    set's up-tree
[Figure: three up-trees for the sets {a, d, g, b, e}, {c, f}, and {h}; each root's parent pointer is NULL]
  • Up-Trees are not necessarily binary
16
Implementing Up-Trees
  • Forest of up-trees can easily be stored in an
    array (call it up)
  • up[X] = parent of X
  • up[X] = -1 if X is a root
[Figure: forest of four up-trees for the sets {g}, {h, i}, {c, f}, {a, b, d, e}, with the corresponding array up]
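To make the array encoding concrete, here is one possible `up` array for the four sets above, with a through i mapped to indices 0 through 8. The exact tree shape is an assumption (d and b hang from a, e from b, f from c, i from h), since the figure itself is not reproduced in this transcript; `depth` and `count_roots` are hypothetical helpers.

```c
#include <assert.h>

#define N 9

/* index:        a   b   c   d   e   f   g   h   i
                 0   1   2   3   4   5   6   7   8   (assumed layout) */
const int up[N] = { -1,  0, -1,  0,  1,  2, -1, -1,  7 };

/* Number of parent links between x and the root of its up-tree. */
int depth(int x) {
  assert(x >= 0 && x < N);
  int steps = 0;
  while (up[x] >= 0) {
    x = up[x];
    steps++;
  }
  return steps;
}

/* Number of up-trees in the forest = number of entries holding -1. */
int count_roots(void) {
  int roots = 0;
  for (int x = 0; x < N; x++)
    if (up[x] < 0) roots++;
  return roots;
}
```

With this layout there are four roots (a, c, g, h), and e sits two links below its root a.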
17
Example: Find
  • Find(x): Just follow parent pointers to the root
  • Find(e) = a
  • Find(f) = c
  • Find(g) = g
[Figure: forest of four up-trees for the sets {g}, {h, i}, {c, f}, {a, b, d, e} with the array up; the path followed by Find(e) is shown]
18
Implementing Find(x)
    #define N 9
    int up[N];

    /* Returns setid of x */
    int Find(int x) {
      while (up[x] >= 0) {   /* index 0 can be a parent, so test >= 0 */
        x = up[x];
      } /* end-while */
      return x;
    } /* end-Find */

  • Running time? O(maxHeight)
[Figure: the same forest and array up, tracing Find(4)]
19
Recursive Find(x)
    #define N 9
    int up[N];

    /* Returns setid of x */
    int Find(int x) {
      if (up[x] < 0) return x;
      return Find(up[x]);
    } /* end-Find */

[Figure: the same forest and array up, tracing Find(4)]
20
Example: Union
  • Union(x, y): Just hang one root from the other!
  • Union(c, a)
[Figure: forest after Union(c, a), with sets {g}, {h, i}, {a, b, d, e, c, f}; in the array up, entry 0 (a) changes from -1 to 2]
21
Implementing Union(x, y)
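    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets. x and y must be roots; y is hung from x */
    void Union(int x, int y) {
      assert(up[x] < 0);
      assert(up[y] < 0);
      up[y] = x;
    } /* end-Union */

  • Running time? O(1)
[Figure: forest after Union(c, a), with sets {g}, {h, i}, {a, b, d, e, c, f}, and the array up]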
22
MakeSets(): Creating initial sets
[Figure: nine single-node up-trees a through i, each with a NULL parent pointer]

    #define N 9
    int up[N];

    /* Make initial sets */
    void MakeSets() {
      int i;
      for (i = 0; i < N; i++) {
        up[i] = -1;
      } /* end-for */
    } /* end-MakeSets */
23
Detailed Example
  • Initial Sets: {a}, {b}, {c}, {d}, {e}, {f}, {g}, {h}, {i}
  • Union(b, e)
24
Detailed Example
  • Sets: {a}, {b, e}, {c}, {d}, {f}, {g}, {h}, {i}
  • Union(a, d)
25
Detailed Example
  • Sets: {a, d}, {b, e}, {c}, {f}, {g}, {h}, {i}
  • Union(a, b)
26
Detailed Example
  • Sets: {a, d, b, e}, {c}, {f}, {g}, {h}, {i}
  • Union(h, i)
27
Detailed Example
  • Sets: {a, d, b, e}, {c}, {f}, {g}, {h, i}
  • Union(c, f)
28
Detailed Example
  • Sets: {a, d, b, e}, {c, f}, {g}, {h, i}
  • Union(c, a)
  • Q: Can we do a better job on this union for
    faster finds in the future?
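The six unions of the detailed example can be replayed with the simple Union and Find shown earlier. This is a sketch: the mapping of a through i to indices 0 through 8 and the helpers `replay_example` and `depth` are assumptions added for illustration.

```c
#include <assert.h>

#define N 9
enum { A, B, C, D, E, F, G, H, I };  /* elements a..i as indices 0..8 */
int up[N];

void MakeSets(void) {
  for (int i = 0; i < N; i++) up[i] = -1;
}

int Find(int x) {
  while (up[x] >= 0) x = up[x];
  return x;
}

/* Simple union: hangs root y from root x. */
void Union(int x, int y) {
  assert(up[x] < 0);
  assert(up[y] < 0);
  up[y] = x;
}

/* Number of parent links from x up to its root. */
int depth(int x) {
  int steps = 0;
  while (up[x] >= 0) { x = up[x]; steps++; }
  return steps;
}

/* Replays the six unions of the walkthrough, in order. */
void replay_example(void) {
  MakeSets();
  Union(B, E);
  Union(A, D);
  Union(A, B);
  Union(H, I);
  Union(C, F);
  Union(C, A);
}
```

After the final Union(c, a), e is three links away from the root c. Hanging c from a instead would have left e at depth 2, which is the improvement the question on the slide is hinting at.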
29
Implementation of Find & Union
    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets. x and y must be roots; y is hung from x */
    void Union(int x, int y) {
      assert(up[x] < 0);
      assert(up[y] < 0);
      up[y] = x;
    } /* end-Union */

  • Running time: O(1)
  • Height depends on previous unions
  • Best case: 1-2, 1-4, 1-5, ... gives height O(1)
  • Worst case: 2-1, 3-2, 4-3, ... gives height O(N)
  • Q: Can we do better?
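The best and worst cases can be reproduced with a short experiment. In this sketch the element count of 8, the 0-based numbering, and the helpers `build_chain`, `build_star`, and `max_depth` are assumptions; Union is the simple version from the slide.

```c
#include <assert.h>

#define N 8     /* assumed element count for this sketch */
int up[N];

void MakeSets(void) {
  for (int i = 0; i < N; i++) up[i] = -1;
}

/* Simple union: hangs root y from root x. */
void Union(int x, int y) {
  assert(up[x] < 0);
  assert(up[y] < 0);
  up[y] = x;
}

int depth(int x) {
  int steps = 0;
  while (up[x] >= 0) { x = up[x]; steps++; }
  return steps;
}

/* Largest depth of any element = height of the tallest up-tree. */
int max_depth(void) {
  int best = 0;
  for (int x = 0; x < N; x++)
    if (depth(x) > best) best = depth(x);
  return best;
}

/* Worst case: each new root takes the previous root as its only child. */
void build_chain(void) {
  MakeSets();
  for (int r = 1; r < N; r++) Union(r, r - 1);
}

/* Best case: every other element hangs directly from element 0. */
void build_star(void) {
  MakeSets();
  for (int y = 1; y < N; y++) Union(0, y);
}
```

The chain ends up with height N-1, so a Find on its deepest element follows N-1 links, while the star keeps every Find to at most one link.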
30
Let's look back at our example
  • Sets: {a, d, b, e}, {c, f}, {g}, {h, i}
  • Union(c, a)
  • Q: Can we do a better job on this union for
    faster finds in the future? How can we make the
    new tree shallow?
31
Speeding up Find: Union-by-Size
  • Idea: In Union, always make the root of the
    larger tree the new root (union-by-size)
[Figure: example starting from the initial sets]
32
Trick for Storing Size Information
  • Instead of storing -1 in root, store up-tree size
    as negative value in root node
[Figure: forest for the sets {g}, {h, i}, {c, f}, {a, b, d, e} with the array up; the roots store -1, -2, -2, and -4 respectively]
33
Implementing Union-by-Size
    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets. Assumes x & y are roots */
    void Union(int x, int y) {
      assert(up[x] < 0);
      assert(up[y] < 0);
      if (x == y) return;   /* already the same set */

      if (up[x] < up[y]) {  // x is bigger. Join y to x
        up[x] += up[y];
        up[y] = x;
      } else {              // y is at least as big. Join x to y
        up[y] += up[x];
        up[x] = y;
      } /* end-else */
    } /* end-Union */

  • Running time? O(1)
34
Running Time for Find with Union-by-Size
  • Finds are O(MaxHeight) for a forest of up-trees
    containing N nodes
  • Theorem: Number of nodes in an up-tree of height
    h using union-by-size is ≥ 2^h
  • Pick up-tree with MaxHeight
  • Then, 2^MaxHeight ≤ N
  • MaxHeight ≤ log N
  • Find takes O(log N)
  • Proof by Induction
  • Base case: h = 0, tree has 2^0 = 1 node
  • Induction hypothesis: Assume true for h < h'
  • Induction step: New tree of height h' was formed
    by hanging a tree of height h'-1 from the root of
    a tree that is at least as large
  • Each tree then has ≥ 2^(h'-1) nodes (the hung tree
    by the induction hypothesis, the other because
    union-by-size makes the larger tree the root)
  • So, total nodes ≥ 2^(h'-1) + 2^(h'-1) = 2^h'
  • Therefore, true for all h
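The theorem can be checked empirically on a small forest. The sketch below repeatedly joins equal-sized trees, a pattern that reaches the bound exactly; the element count of 16 and the helpers `build_tall_forest`, `tree_size`, and `bound_holds` are assumptions for illustration.

```c
#include <assert.h>

#define N 16     /* assumed element count; a power of two */
int up[N];       /* root: -(tree size); otherwise: parent index */

void MakeSets(void) {
  for (int i = 0; i < N; i++) up[i] = -1;
}

/* Find without path compression, so tree heights stay untouched. */
int Find(int x) {
  while (up[x] >= 0) x = up[x];
  return x;
}

/* Union-by-size. Assumes x and y are distinct roots. */
void Union(int x, int y) {
  assert(up[x] < 0 && up[y] < 0 && x != y);
  if (up[x] < up[y]) { up[x] += up[y]; up[y] = x; }
  else               { up[y] += up[x]; up[x] = y; }
}

int depth(int x) {
  int steps = 0;
  while (up[x] >= 0) { x = up[x]; steps++; }
  return steps;
}

int tree_size(int x) { return -up[Find(x)]; }

/* Joins trees of size 1, then 2, then 4, ... until one tree remains. */
void build_tall_forest(void) {
  MakeSets();
  for (int step = 1; step < N; step *= 2)
    for (int i = 0; i + step < N; i += 2 * step)
      Union(Find(i), Find(i + step));
}

/* Checks 2^depth(x) <= size of x's tree for every element x. */
int bound_holds(void) {
  for (int x = 0; x < N; x++)
    if ((1 << depth(x)) > tree_size(x)) return 0;
  return 1;
}
```

With 16 elements the deepest node ends up 4 links from the root, matching log2(16) = 4, and no element violates the 2^h bound.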

35
Union-by-Height
  • Textbook describes alternative strategy of
    Union-by-height
  • Keep track of height of each up-tree in the root
    nodes
  • Union makes root of up-tree with greater height
    the new root
  • Same results and similar implementation as
    Union-by-Size
  • Find is O(log N) and Union is O(1)
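A possible union-by-height variant is sketched below. The encoding is an assumption of this sketch, not something shown on the slides: a root stores -(height + 1), so a single-node tree still holds -1, and `Height` is a hypothetical helper.

```c
#include <assert.h>

#define N 8      /* assumed element count for this sketch */
int up[N];       /* root: -(height + 1); otherwise: parent index */

void MakeSets(void) {
  for (int i = 0; i < N; i++) up[i] = -1;   /* height 0 */
}

int Find(int x) {
  while (up[x] >= 0) x = up[x];
  return x;
}

/* Joins two sets by height. Assumes x and y are distinct roots. */
void Union(int x, int y) {
  assert(up[x] < 0 && up[y] < 0 && x != y);
  if (up[x] < up[y]) {          /* x is taller: its height is unchanged */
    up[y] = x;
  } else if (up[y] < up[x]) {   /* y is taller: its height is unchanged */
    up[x] = y;
  } else {                      /* equal heights: result is one taller */
    up[y] = x;
    up[x]--;
  }
}

/* Height of the up-tree rooted at root. */
int Height(int root) {
  assert(up[root] < 0);
  return -up[root] - 1;
}
```

The height only grows when two trees of equal height are joined, which is why the same logarithmic bound as union-by-size applies.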

36
Can we make Find go faster?
  • Can we make Find(g) do something so that future
    Find(g) calls will run faster?
  • Right now, M Find(g) calls run in total O(M log N)
    time
  • Can we reduce this to O(M)?
[Figure: up-trees for the sets {h, i}, {c, f}, {a, b, d, e, g}]
  • Idea: Make Find have side-effects so that future
    Finds will run faster.

37
Introducing Path Compression
  • Path Compression: Point everything along the path
    of a Find to the root
  • Reduces height of entire access path to 1
  • Finds get faster!
[Figure: Find(g) with path compression on the up-trees for the sets {h, i}, {c, f}, {a, b, d, e, g}]
38
Another Path Compression Example
[Figure: Find(g) with path compression on the up-trees for the sets {c, f} and {a, b, d, h, e, i, g}]
39
Implementing Path Compression
  • Path Compression: Point everything along the path
    of a Find to the root
  • Reduces height of entire access path to 1
  • Finds get faster!

    #define N 9
    int up[N];

    /* Returns setid of x */
    int Find(int x) {
      if (up[x] < 0) return x;

      int root = Find(up[x]);
      up[x] = root;   /* Point to the root */
      return root;
    } /* end-Find */

  • Running time: O(MaxHeight)
  • But, what happens to the tree height over time?
  • It gets smaller
  • What's the total running time if we do M Finds?
  • Turns out this is equal to O(M · InvAckermann(M, N))
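The effect of path compression is easiest to see on a deliberately bad tree. In this sketch the 6-element chain and the helpers `build_chain` and `depth` are assumptions; Find is the recursive version with path compression.

```c
#include <assert.h>

#define N 6      /* assumed element count for this sketch */
int up[N];

/* Builds the chain 0 -> 1 -> 2 -> ... -> N-1, with N-1 as the root. */
void build_chain(void) {
  for (int i = 0; i < N - 1; i++) up[i] = i + 1;
  up[N - 1] = -1;
}

/* Number of parent links from x up to its root (no side effects). */
int depth(int x) {
  int steps = 0;
  while (up[x] >= 0) { x = up[x]; steps++; }
  return steps;
}

/* Find with path compression: every node on the path is re-pointed
   directly at the root on the way back out of the recursion. */
int Find(int x) {
  if (up[x] < 0) return x;
  int root = Find(up[x]);
  up[x] = root;
  return root;
}
```

Before the first call, element 0 is 5 links from the root. A single Find(0) flattens the whole chain, so afterwards every non-root element is exactly one link from the root.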

40
Running time of Find with Path Compression
  • What's the total running time if we do M Finds?
  • Turns out this is equal to O(M · InvAckermann(M, N))
  • InverseAckermann(M, N) ≤ 4 for all practical
    values of M and N
  • So, total running time of M Finds is O(4M) = O(M)
  • Meaning that the amortized running time of Find
    with path compression (used together with
    Union-by-Size/Height) is O(1) for all practical
    purposes

41
Summary of Disjoint Set ADT
  • The Disjoint Set ADT allows us to represent
    objects that fall into different equivalence
    classes or sets
  • Two main operations: Union of two classes and
    Find class name for a given element
  • Up-Tree data structure allows efficient array
    implementation
  • Unions take O(1) worst case time, Finds can take
    O(N)
  • Union-by-Size (or by-Height) reduces worst case
    time for Find to O(log N)
  • If we use both Union-by-Size/Height & Path
    Compression
  • Any sequence of M Union/Find operations results
    in O(1) amortized time per operation (for all
    practical purposes)

42
Applications of Disjoint Set ADT
  • Disjoint sets can be used to represent
  • Cities on a map (disjoint sets of connected
    cities)
  • Electrical components on chip
  • Computers connected in a network
  • Groups of people related to each other by blood
  • Textbook example: Maze generation using
    Unions/Finds
  • Start with walls everywhere and each cell in a
    set by itself
  • Knock down walls randomly and Union cells that
    become connected
  • Use Find to find out if two cells are already
    connected
  • Terminate when starting and ending cell are in
    same set i.e. connected (or when all cells are in
    same set)
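The maze steps can be sketched as follows. This is one possible reading of the bullets, not the textbook's code: the 4 x 5 grid, the wall arrays, the use of `rand`, and the function names are all assumptions. A wall is knocked down only when Find reports that the two cells on either side are not yet connected, and the loop stops when all cells are in the same set.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS 4
#define COLS 5
#define CELLS (ROWS * COLS)

int up[CELLS];           /* up-tree forest over the cells, sizes in roots */
int wall_right[CELLS];   /* 1 if the wall between cell k and k+1 stands */
int wall_down[CELLS];    /* 1 if the wall between cell k and k+COLS stands */

int Find(int x) {
  if (up[x] < 0) return x;
  return up[x] = Find(up[x]);          /* path compression */
}

void Union(int x, int y) {             /* by size; distinct roots */
  assert(up[x] < 0 && up[y] < 0 && x != y);
  if (up[x] < up[y]) { up[x] += up[y]; up[y] = x; }
  else               { up[y] += up[x]; up[x] = y; }
}

/* Builds a maze and returns the number of walls knocked down. */
int generate_maze(unsigned seed) {
  srand(seed);
  for (int k = 0; k < CELLS; k++) {
    up[k] = -1;                        /* each cell in a set by itself */
    wall_right[k] = 1;                 /* walls everywhere */
    wall_down[k] = 1;
  }
  int removed = 0;
  while (removed < CELLS - 1) {        /* until all cells are connected */
    int k = rand() % CELLS;            /* pick a random wall of cell k */
    int right = rand() % 2;
    if (right && k % COLS == COLS - 1) continue;   /* outer wall */
    if (!right && k / COLS == ROWS - 1) continue;  /* outer wall */
    int other = right ? k + 1 : k + COLS;
    int rk = Find(k), ro = Find(other);
    if (rk == ro) continue;            /* already connected: keep wall */
    if (right) wall_right[k] = 0; else wall_down[k] = 0;
    Union(rk, ro);
    removed++;
  }
  return removed;
}

/* 1 if every cell is in the same set as cell 0. */
int all_connected(void) {
  int root = Find(0);
  for (int k = 1; k < CELLS; k++)
    if (Find(k) != root) return 0;
  return 1;
}
```

Because a wall is removed only between unconnected cells, exactly CELLS - 1 walls come down and there is a unique path between any two cells.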

43
Disjoint Set ADT Declaration & Operations
    class DisjointSet {
    private:
      int *up;  // Up links array
      int N;    // Number of sets
    public:
      DisjointSet(int n);              // Creates N sets
      ~DisjointSet() { delete[] up; }  // up was allocated with new[]
      int Find(int x);
      void Union(int x, int y);
    };
44
Operations: DisjointSet, Find
    /* Create N sets */
    DisjointSet::DisjointSet(int n) {
      int i;
      N = n;
      up = new int[N];
      for (i = 0; i < N; i++) up[i] = -1;
    } //end-DisjointSet

    /* Returns setid of x */
    int DisjointSet::Find(int x) {
      if (up[x] < 0) return x;

      int root = Find(up[x]);
      up[x] = root;   /* Point to the root */
      return root;
    } /* end-Find */
45
Operations: Union (by size)
    /* Joins two sets. Assumes x & y are roots */
    void DisjointSet::Union(int x, int y) {
      assert(up[x] < 0);
      assert(up[y] < 0);
      if (x == y) return;   // already the same set

      if (up[x] < up[y]) {  // x is bigger. Join y to x
        up[x] += up[y];
        up[y] = x;
      } else {              // y is at least as big. Join x to y
        up[y] += up[x];
        up[x] = y;
      } /* end-else */
    } /* end-Union */