Title: Disjoint Set Structures for Operations over Sets (Reference: textbook, pp. 175-180)
- CS2223 Recitation 3
- March 30, 2005
- Song Wang
2. Problem Description
- Given
  - A set S with N objects, identified by the numbers 1 to N.
  - Disjoint partitions (subsets) of the set S.
    - Every item belongs to one partition.
    - No item belongs to more than one partition.
- What to do
  - Find: given an object, find which set contains it.
  - Merge: given two sets, merge them into one set.
- Why
  - These are basic and frequently used functions for set operations such as union and intersection.
  - Consequently, this is an important problem for many other algorithms, such as finding the minimum spanning tree.
3. Preliminaries
- Data structure for a set: a tree
  - The parent (root) node denotes each set.
  - The smallest object serves as the parent node (one choice).
4. Preliminaries II
- Degraded linked list: an array that records the parent only
  Index: 1 2 3 4 5 6 7 8 9
  [Figure: array set[1..9] holding the parent of each object; values not recoverable]
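As a minimal sketch of the parent-array idea, assume a hypothetical partition of the nine objects into {1, 5, 7}, {2, 3, 9} and {4, 6, 8}. These values are illustrative only and are not the slide's array. Each entry stores the parent of an object, and a root is its own parent. The helper names `roots` and `members` are assumptions made for this example.

```python
# Hypothetical partition of objects 1..9 (not the values from the slide):
# {1, 5, 7}, {2, 3, 9}, {4, 6, 8}, each labelled by its smallest object.
# parent[0] is unused padding so that object k lives at index k.
parent = [0, 1, 2, 2, 4, 1, 4, 1, 4, 2]

def roots(parent):
    """Return the objects that are their own parent, i.e. the set labels."""
    return [k for k in range(1, len(parent)) if parent[k] == k]

def members(parent, label):
    """Return the objects whose parent entry is `label` (trees of height 1)."""
    return [k for k in range(1, len(parent)) if parent[k] == label]

set_labels = roots(parent)
set_of_two = members(parent, 2)
```

Here `set_labels` lists one label per set, and `set_of_two` lists the objects stored under label 2.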
5. Solution 1: find1()
find1(7) = 1: object 7 belongs to set 1
find1(2) = 2: object 2 belongs to set 2
function find1(x)
    return set[x]
6. Solution 1: merge1()
Merge sets 1 and 2
procedure merge1(a, b)
    i <- min(a, b)
    j <- max(a, b)
    for k <- 1 to N do
        if set[k] = j then set[k] <- i
Scan: the loop visits every entry of the array.
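The pseudocode for Solution 1 can be sketched in Python as follows. This is an illustrative translation, not the textbook's code: the array is named `set_` to avoid shadowing Python's built-in `set`, index 0 is left unused so that objects keep their numbers 1 to N, and `make_sets` is a helper assumed for the example.

```python
def make_sets(n):
    """Each object 1..n starts in its own set; index 0 is unused."""
    return list(range(n + 1))

def find1(set_, x):
    """Return the label of the set containing x with a single array lookup."""
    return set_[x]

def merge1(set_, a, b):
    """Merge the sets labelled a and b; the smaller label survives."""
    i, j = min(a, b), max(a, b)
    for k in range(1, len(set_)):   # scan all N entries
        if set_[k] == j:
            set_[k] = i

s = make_sets(9)
merge1(s, 1, 7)   # {1, 7}
merge1(s, 2, 3)   # {2, 3}
merge1(s, 1, 2)   # {1, 2, 3, 7}, labelled 1
```

Every call to `merge1` walks the whole array, even when only one entry changes. That cost is the linear term in the analysis.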
7. Performance Analysis of find1() and merge1()
- Case study: n calls to find and at most N-1 calls to merge (n is comparable to N).
- Function find1 takes constant time, Θ(1).
- Procedure merge1 takes linear time, Θ(N).
- Total: n·Θ(1) + (N-1)·Θ(N) = Θ(N^2), or Θ(n^2).
8. Can We Do Better?
Merge sets 1 and 2
9. Solution 2: merge2()
Merge sets 1 and 2
procedure merge2(a, b)
    if a < b then set[b] <- a
    else set[a] <- b
Guarantee: the root of each tree is its smallest object.
10. Solution 2: find2()
find2(5) = 1: we need to traverse the whole path from node 5 to the root node 1
function find2(x)
    r <- x
    while set[r] != r do
        r <- set[r]
    return r
Only for a root does r = set[r].
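A minimal Python sketch of Solution 2, under the same assumptions as the pseudocode: `a` and `b` passed to `merge2` are roots of different sets. The array is called `parent` here for readability, and `make_sets` is a helper assumed for the example.

```python
def make_sets(n):
    """Each object 1..n starts as the root of its own tree; index 0 is unused."""
    return list(range(n + 1))

def find2(parent, x):
    """Follow parent links from x until reaching a root (a node that is its own parent)."""
    r = x
    while parent[r] != r:
        r = parent[r]
    return r

def merge2(parent, a, b):
    """Link two roots in constant time; the smaller root becomes the parent."""
    if a < b:
        parent[b] = a
    else:
        parent[a] = b

p = make_sets(9)
merge2(p, find2(p, 5), find2(p, 7))   # 7 hangs under 5
merge2(p, find2(p, 1), find2(p, 5))   # 5 hangs under 1
label = find2(p, 7)                   # walks 7 -> 5 -> 1
```

Calling `find2` on both arguments before merging is one way to make sure `merge2` always receives roots.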
11. Performance Analysis of find2() and merge2()
- Case study: n calls to find and at most N-1 calls to merge (n is comparable to N).
- Function find2 takes linear time, Θ(N), in the worst case.
- Procedure merge2 takes constant time, Θ(1).
- Total (worst case): n·Θ(N) + (N-1)·Θ(1) = Θ(N^2), or Θ(n^2).
- No improvement!
12. What Is the Problem?
- The worst case: a linear tree
find2(6)? The height of the tree is essential for performance.
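To illustrate the linear tree, the sketch below builds one possible worst case for the smallest-root rule by merging roots N-1 and N, then N-2 and N-1, and so on, and counts the parent links that a find has to follow. The helper names `linear_tree` and `find2_steps` are assumptions made for this example.

```python
def linear_tree(n):
    """Build a chain n -> n-1 -> ... -> 1 using the smallest-root rule."""
    parent = list(range(n + 1))       # index 0 unused
    for b in range(n, 1, -1):
        a = b - 1                     # a and b are both roots, and a < b
        parent[b] = a                 # what merge2(a, b) does in this case
    return parent

def find2_steps(parent, x):
    """Return (root, number of parent links followed from x)."""
    r, steps = x, 0
    while parent[r] != r:
        r = parent[r]
        steps += 1
    return r, steps

chain = linear_tree(6)
root, steps = find2_steps(chain, 6)
```

With six objects, a find on object 6 follows five links, so the cost of a find grows linearly with the number of objects in such a tree.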
13. How to Avoid a Bad Merge Tree
merge(1, 4)
14. Who Is Whose Subtree?
- Tree t1 has height h1 and tree t2 has height h2.
- If h1 < h2: t1 becomes a subtree of t2, and the merged tree's height is h2 (symmetrically, if h1 > h2, t2 becomes a subtree of t1 and the height is h1).
- If h1 = h2: t1 becomes a subtree of t2, and the merged tree's height is h1 + 1.
- The root of the tree is no longer always the smallest node!
15. Theorem 5.9.1, p. 177
- A tree containing k nodes, built by the height-based merging rule, has height at most log k (logarithm base 2).
- Proof by induction.
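The induction can be sketched as follows. This is a reconstruction of the standard argument rather than a quotation from the textbook. The symbols $a$, $b$, $h_a$, $h_b$ are introduced here for illustration, and $\lg$ denotes the base-2 logarithm.

```latex
\textbf{Claim.} Starting from singleton trees and merging only by the height rule,
every tree with $k$ nodes has height $h \le \lfloor \lg k \rfloor$.

\textbf{Base case.} $k = 1$: the height is $0 = \lg 1$.

\textbf{Inductive step.} Suppose the claim holds for all trees with fewer than $k$
nodes, and let a tree with $k$ nodes result from merging trees with $a$ and $b$
nodes and heights $h_a$ and $h_b$, where $a + b = k$ and, without loss of
generality, $a \le b$, so that $a \le k/2$.
\begin{itemize}
  \item If $h_a \ne h_b$: $h = \max(h_a, h_b) \le \max(\lg a, \lg b) \le \lg k$.
  \item If $h_a = h_b$: $h = h_a + 1 \le \lg a + 1 = \lg(2a) \le \lg k$.
\end{itemize}
Since $h$ is an integer, $h \le \lfloor \lg k \rfloor$.
```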
16. Solution 3: merge3()
procedure merge3(a, b)
    if height[a] = height[b] then
        height[a] <- height[a] + 1
        set[b] <- a
    else if height[a] > height[b] then set[b] <- a
    else set[a] <- b
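A minimal Python sketch of merging by height, assuming that `a` and `b` are distinct roots and that a separate `height` array starts at 0 for every singleton tree. The names `parent`, `height` and `make_sets` are choices made for this example.

```python
def make_sets(n):
    """Return (parent, height) for n singleton trees; index 0 is unused."""
    return list(range(n + 1)), [0] * (n + 1)

def find2(parent, x):
    """Follow parent links from x up to the root."""
    r = x
    while parent[r] != r:
        r = parent[r]
    return r

def merge3(parent, height, a, b):
    """Link two distinct roots so that the shorter tree hangs under the taller one."""
    if height[a] == height[b]:
        height[a] += 1          # equal heights: the merged tree grows by one
        parent[b] = a
    elif height[a] > height[b]:
        parent[b] = a
    else:
        parent[a] = b

parent, height = make_sets(8)
for a, b in [(1, 2), (3, 4), (5, 6), (7, 8), (1, 3), (5, 7), (1, 5)]:
    merge3(parent, height, find2(parent, a), find2(parent, b))
```

Merging eight objects pairwise like this always joins trees of equal height, which is the case that makes the tree grow. The final height is 3, matching the log k bound for k = 8.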
17. Performance Analysis of find2() and merge3()
- Case study: n calls to find and at most N-1 calls to merge (n is comparable to N).
- Function find2 takes less than linear time, Θ(log N), in the worst case.
- Procedure merge3 takes constant time, Θ(1).
- Total: n·Θ(log N) + (N-1)·Θ(1) = Θ(n log n).
- Some improvement.
18. Path Compression in find3()
- Intuitive explanation
  - The larger the fan-out of the nodes, the smaller the height of the tree.
find3(20)
19. Solution 3: find3()
function find3(x)
    r <- x
    while set[r] != r do
        r <- set[r]
    i <- x
    while i != r do
        j <- set[i]
        set[i] <- r
        i <- j
    return r
First traversal of the path: find the root.
Second traversal of the path: connect the nodes on the path to the root.
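The two traversals can be sketched in Python as below. This is an illustrative translation of the pseudocode, run on a hypothetical five-node chain chosen so that the effect of compression is easy to see.

```python
def find3(parent, x):
    """Find the root of x, then point every node on the path directly at it."""
    r = x
    while parent[r] != r:       # first traversal: locate the root
        r = parent[r]
    i = x
    while i != r:               # second traversal: compress the path
        j = parent[i]           # remember the next node before overwriting
        parent[i] = r
        i = j
    return r

# Hypothetical chain 5 -> 4 -> 3 -> 2 -> 1 (index 0 unused).
parent = [0, 1, 1, 2, 3, 4]
root = find3(parent, 5)
```

After this single call every node on the path is a direct child of the root, so later finds on those nodes follow one link. When find3 is combined with merge3, the stored heights are no longer exact after compression and act only as upper bounds.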
20. Performance Analysis of find3() and merge3()
- Case study: n calls to find and at most N-1 calls to merge (n is comparable to N).
- Function find3 takes little more than constant time per call, averaged over the sequence of calls.
- Procedure merge3 takes constant time, Θ(1).
- Total: close to Θ(n).
- Best one!
21. Summary
- find1() and merge1(): best for find, worst for merge (height 1, always)
- find2() and merge2(): best for merge, worst for find (height N, worst case)
- Mixing the above:
  - find2() and merge3(): height lg N in the worst case
  - find3() and merge3(): height close to 1; best for both