Disjoint Sets

Slides: 18
Provided by: vdou8

1
Disjoint Sets
  • Set: a collection of (distinguishable) elements
  • Two sets are disjoint if they have no common
    elements
  • Disjoint-set data structure
  • maintains a collection of disjoint sets
  • each set has a representative element
  • supported operations
  • MakeSet(x)
  • Find(x)
  • Union(x,y)

2
Disjoint Sets
  • Used in applications requiring the partition of a
    set into equivalence classes.
  • Maze generation
  • Several graph algorithms
  • (e.g. Kruskal's algorithm for minimum spanning
    trees)
  • Compiler algorithms
  • Equivalence of finite automata

3
Disjoint Sets
  • Major operations
  • MakeSet(x)
  • Given an object x, create a set out of it. The
    representative of the set is x
  • Find(x)
  • Given an object x, return the representative of
    the set containing x
  • Union(x, y)
  • Given elements x, y, merge the sets they belong
    to.
  • The original sets are destroyed.
  • The new set has a new representative

4
Disjoint Sets
  • In the discussion that follows:
  • n is the total number of elements (in all sets)
  • m is the total number of operations performed (a
    mix of MakeSet, Union, Find operations)
  • m is at least equal to n since there must be a
    MakeSet operation for each element.
  • The maximum number of Union operations that may
    be performed is n-1.
  • We will perform amortized analysis.
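
The bound on the number of unions can be checked with a small sketch. The function name count_effective_unions and the flat list rep are assumptions made for this illustration, not part of the slides. Every union that actually merges two different sets reduces the number of sets by one, so starting from n singleton sets at most n-1 such unions are possible.

```python
def count_effective_unions(n, pairs):
    """Return (merges, sets_left) after applying Union to each pair in order."""
    rep = list(range(n))      # n MakeSet operations: element i represents itself
    merges, sets_left = 0, n
    for x, y in pairs:
        rx, ry = rep[x], rep[y]
        if rx != ry:          # only a union of two different sets changes anything
            rep = [rx if r == ry else r for r in rep]
            merges += 1
            sets_left -= 1
    return merges, sets_left


n = 5
all_pairs = [(x, y) for x in range(n) for y in range(n)]
merges, sets_left = count_effective_unions(n, all_pairs)
```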

5
Disjoint Sets
  • Implementation 1: Using linked lists
  • The head of the list is also the representative
  • Each node contains
  • an element
  • a pointer to the next node
  • a pointer to the representative
  • Why? Because this will speed up the Find operation

6
Disjoint Sets
  • Implementation 1: Using linked lists
  • MakeSet(x)
  • Create a list with one node, x
  • Time for one operation: O(1)
  • Find(x)
  • Assuming we already have a pointer to x (*), just
    return the pointer to the representative
  • Time for one operation: O(1)

(*) Usually, we have a vector of pointers to the
individual nodes.

7
Disjoint Sets
  • Implementation 1: Using linked lists
  • Union(x, y)
  • Perform Find(x) to find x's representative, rx
  • Perform Find(y) to find y's representative, ry
  • Append ry's list to the end of rx's list
  • rx becomes the representative of the new set.
  • The elements that used to be in ry's list should
    have their pointers to the representative
    updated.
  • Idea 1: Do a lazy update: set ry's pointer to rx
    and leave the rest the way they are. This will
    make Union faster but will slow down the Find
    operation.
  • Idea 2: Update all applicable pointers. This
    will maintain the constant Find() time.
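
A minimal Python sketch of the linked-list representation with Idea 2 (update every pointer) might look as follows. The Node class, the field names and the tail pointer stored at the head are assumptions made for this illustration; the slides only require an element, a next pointer and a representative pointer per node.

```python
class Node:
    def __init__(self, element):
        self.element = element
        self.next = None   # next node in the list
        self.rep = self    # pointer to the representative (head of the list)
        self.tail = self   # assumption: kept only on the head, for O(1) append


def make_set(x):
    """Create a one-node list; the node is its own representative."""
    return Node(x)


def find(node):
    """O(1): every node points directly at its representative."""
    return node.rep


def union(x, y):
    """Append ry's list to rx's list and update all moved pointers."""
    rx, ry = find(x), find(y)
    if rx is ry:
        return rx
    rx.tail.next = ry          # link the end of rx's list to ry's head
    rx.tail = ry.tail
    node = ry
    while node is not None:    # Idea 2: eager update of every moved node
        node.rep = rx
        node = node.next
    return rx
```

The cost of union here is proportional to the length of ry's list, which is what the analysis on the following slides counts.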

8
Disjoint Sets
  • Implementation 1: Using linked lists
  • Union(x, y)
  • A sequence of m operations may take O(m + n^2) time
  • How? Given elements 1, 2, 3, ..., n, do Union(1,
    2), Union(3, 1), Union(4, 1), etc. At step i,
    we append a list of length i to a list of length
    1, thus updating i pointers to the new
    representative. After n-1 unions, we'll have a
    single set and we will have performed O(n^2)
    pointer updates.
  • So let's be smart about it
  • Keep track of the length of each list and always
    append the shorter list to the longer one.
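
The quadratic behaviour of the bad sequence can be reproduced with a short counting sketch. The function name updates_for_bad_sequence and the dictionary-based bookkeeping are assumptions for illustration; only the number of representative-pointer updates is modelled, not the list nodes themselves.

```python
def updates_for_bad_sequence(n):
    """Run Union(1,2), Union(3,1), Union(4,1), ... and count pointer updates.

    Union(x, y) always appends ry's list to rx's list, as on the slide.
    Assumes n >= 2.
    """
    rep = {i: i for i in range(1, n + 1)}        # element -> representative
    members = {i: [i] for i in range(1, n + 1)}  # representative -> its list
    updates = 0

    def union(x, y):
        nonlocal updates
        rx, ry = rep[x], rep[y]
        if rx == ry:
            return
        moved = members.pop(ry)
        for e in moved:            # one pointer update per moved element
            rep[e] = rx
            updates += 1
        members[rx].extend(moved)

    union(1, 2)
    for i in range(3, n + 1):
        union(i, 1)
    return updates
```

For this sequence the count is 1 + 2 + ... + (n-1) = n(n-1)/2, for example 45 updates for n = 10.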

9
Disjoint Sets
  • Implementation 1: Using linked lists
  • Union(x, y)
  • A sequence of m operations where all unions
    append the shorter list to the longer one takes
    O(m + n lg n) time
  • Why? Because with each union we attach a list of
    length i to a list of length at least i, so the
    moved elements end up in a list at least twice
    as long. By the time we get a single set
    containing all elements, each element's pointer
    to the representative will have been updated at
    most lg n times, giving a total of at most
    n lg n pointer updates.
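
The doubling argument can be checked empirically with a sketch like the one below. The function name max_updates_weighted, the random choice of which sets to merge and the seed parameter are all assumptions for illustration; the only rule taken from the slides is that the shorter list is always appended to the longer one.

```python
import math
import random


def max_updates_weighted(n, seed=0):
    """Merge n singleton sets in random order, shorter list into longer.

    Returns (largest number of updates seen by any one element,
             total number of pointer updates).
    """
    rng = random.Random(seed)
    rep = list(range(n))                  # element -> representative
    members = {i: [i] for i in range(n)}  # representative -> its list
    updates = [0] * n
    while len(members) > 1:
        rx, ry = rng.sample(list(members), 2)
        if len(members[rx]) < len(members[ry]):
            rx, ry = ry, rx               # rx now owns the longer list
        for e in members.pop(ry):         # only the shorter list is relabelled
            rep[e] = rx
            updates[e] += 1
            members[rx].append(e)
    return max(updates), sum(updates)
```

Each time an element is relabelled its list at least doubles in length, so no element should be updated more than lg n times, whatever merge order is chosen.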

10
Disjoint Sets
  • Implementation 2: Using arrays
  • Maintain an array of size n
  • Cell i of the array holds the representative of
    the set containing i.
  • Similar to lists, simpler to implement if we know
    the number of elements in advance.
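
A minimal sketch of the array implementation, assuming the elements are the integers 0..n-1 (the class name ArrayDisjointSets is hypothetical):

```python
class ArrayDisjointSets:
    def __init__(self, n):
        # MakeSet for every element: cell i holds the representative of i.
        self.rep = list(range(n))

    def find(self, i):
        """O(1): read the representative straight from the array."""
        return self.rep[i]

    def union(self, x, y):
        """O(n): relabel every cell that currently holds ry."""
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        for i in range(len(self.rep)):
            if self.rep[i] == ry:
                self.rep[i] = rx
```

As with the list version, Find is constant time and Union pays for relabelling; here Union scans the whole array because the members of a set are not linked together.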

11
Disjoint Sets
  • Implementation 3: Using trees
  • Each set is represented by a tree structure where
    every node has a pointer to its parent.
  • This tree is called an up-tree
  • The root is the representative of the set
  • The elements are not in any particular order.

12
Disjoint Sets
  • Implementation 3: Using trees
  • MakeSet(x)
  • Create a tree containing only the root, x
  • Time for one operation: O(1)
  • Find(x)
  • Follow the parent pointers to the root.
  • Time for one operation: O(depth of node)
  • Could be up to O(n)
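
A sketch of MakeSet and Find on up-trees, assuming parent pointers are kept in a dictionary and that a root is marked by pointing to itself (both are conventions chosen for this illustration):

```python
parent = {}  # element -> its parent in the up-tree


def make_set(x):
    """Create a one-node tree; the root points to itself."""
    parent[x] = x


def find(x):
    """Follow parent pointers up to the root; cost is the depth of x."""
    while parent[x] != x:
        x = parent[x]
    return x


# Hand-built example: a chain c -> b -> a, so 'a' is the representative.
for name in ("a", "b", "c"):
    make_set(name)
parent["b"] = "a"
parent["c"] = "b"
```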

13
Disjoint Sets
[Figure: an example up-tree with nodes 12, 5, 3, 9, 8, 1, 2]

14
Disjoint Sets
  • Implementation 3: Using trees
  • Union(x, y)
  • Perform Find(x) to locate the representative of
    x, sx
  • Perform Find(y) to locate the representative of
    y, sy
  • Make sy a child of sx
  • Danger: if we are not smart about it, our tree
    may end up looking like a list
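
The danger can be made concrete with a small sketch. The function name naive_chain_depth and the particular sequence Union(2, 1), Union(3, 1), ..., Union(n, 1) are assumptions chosen to trigger the bad case; the union rule itself (make sy a child of sx) is the one from the slide.

```python
def naive_chain_depth(n):
    """Build one set from elements 1..n with naive unions; return depth of 1."""
    parent = list(range(n + 1))  # index 0 is unused; roots point to themselves

    def find(x):
        steps = 0
        while parent[x] != x:
            x = parent[x]
            steps += 1
        return x, steps

    def union(x, y):
        sx, _ = find(x)
        sy, _ = find(y)
        if sx != sy:
            parent[sy] = sx      # always make sy a child of sx

    for i in range(2, n + 1):
        union(i, 1)              # the old root becomes a child of the new element
    return find(1)[1]
```

Each union hangs the whole existing tree under a single new node, so element 1 ends up n-1 links away from the root and Find(1) takes linear time.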

[Figure: successive unions on elements 1, 2, 3, 4 producing a tree that looks like a list]

15
Disjoint Sets
  • Implementation 3: Using trees
  • Union(x, y)
  • Always make the smaller tree a child of the
    larger tree.
  • How do we define "smaller"?
  • Heuristic 1: Union-by-weight
  • Smaller = fewer nodes
  • Store number of nodes at the representative
  • Add the two weights when performing a union
  • Heuristic 2: Union-by-height
  • Smaller = shorter
  • Store height at the representative
  • The height increases only when two trees of equal
    height are united.
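
Both heuristics can be sketched in one small class. The class name UpTrees, the by parameter and the integer elements 0..n-1 are assumptions for illustration. Heights are counted in edges, and both the weight and the height are tracked in either mode so they can be inspected afterwards.

```python
class UpTrees:
    def __init__(self, n, by="weight"):
        self.parent = list(range(n))  # roots point to themselves
        self.weight = [1] * n         # node count, meaningful at roots only
        self.height = [0] * n         # height in edges, meaningful at roots only
        self.by = by                  # "weight" or "height"

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        sx, sy = self.find(x), self.find(y)
        if sx == sy:
            return sx
        key = self.weight if self.by == "weight" else self.height
        if key[sx] < key[sy]:
            sx, sy = sy, sx           # sx is now the "larger" tree
        self.parent[sy] = sx          # smaller tree becomes a child of the larger
        self.weight[sx] += self.weight[sy]
        # In "height" mode this grows the height only when the two are equal.
        self.height[sx] = max(self.height[sx], self.height[sy] + 1)
        return sx
```

With either choice of by, the sequence of unions that produced a chain under the naive rule now yields a shallow tree.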

16
Disjoint Sets
  • Implementation 3: Using trees
  • Union(x, y)
  • With either heuristic, the height of a tree is at
    most lg n + 1, where n is the number of elements
    in the tree.
  • We can do better than that!
  • Optimizing further through path compression.
  • Our goal is to minimize the height of the tree
  • Every time we perform a Find(z) operation, we
    make all nodes on the path from the root to z
    immediate children of the root.
  • When path compression is performed, a sequence of
    m operations takes O(m lg n) time.
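
Path compression can be sketched as a two-pass Find. The function name find_compress and the list-based parent array are assumptions for illustration: the first pass locates the root, the second re-points every node on the path directly at it.

```python
def find_compress(parent, z):
    """Return the root of z and make every node on the path a child of the root."""
    root = z
    while parent[root] != root:   # pass 1: walk up to the root
        root = parent[root]
    while z != root:              # pass 2: re-point each node on the path
        next_node = parent[z]
        parent[z] = root
        z = next_node
    return root


# A chain 4 -> 3 -> 2 -> 1 -> 0 collapses after a single Find(4).
parent = [0, 0, 1, 2, 3]
root = find_compress(parent, 4)
```

After the call every node that was on the path has depth 1, so later Find operations on those nodes take constant time.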

17
Disjoint Sets
  • Implementation 3: Using trees
  • Union(x, y)
  • Path compression and union-by-weight can be
    performed at the same time.
  • Path compression and union-by-height can also be
    performed at the same time.
  • This is more complex since path compression
    changes the height of the tree.
  • We usually prefer to estimate the height instead
    of computing it exactly. We then talk about
    union-by-rank with path compression.
  • When we perform union-by-weight/height with path
    compression, a sequence of m operations takes
    time that is almost linear in m: O(m α(n)), where
    α is the very slowly growing inverse Ackermann
    function.
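
Putting the pieces together, union-by-rank with path compression could be sketched as below. The class name DisjointSets and the dictionary storage are assumptions for illustration. The rank is an upper bound on the height: it is never decreased when path compression flattens a tree.

```python
class DisjointSets:
    def __init__(self):
        self.parent = {}  # element -> parent; roots point to themselves
        self.rank = {}    # element -> rank (an estimate of the height)

    def make_set(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.rank[x] = 0

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while x != root:                   # path compression
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                # rx has the larger (or equal) rank
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1             # rank grows only on ties
        return rx
```

A short usage example: after make_set on 1..6 and union(1, 2), union(3, 4), union(2, 4), the elements 1, 2, 3 and 4 share one representative while 5 and 6 remain in singleton sets.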