Title: I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis
1 I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis
- Pankaj K. Agarwal, Lars Arge, and Ke Yi
- Duke University
- University of Aarhus
2 The Union-Find Problem
- A universe of N elements: x1, x2, ..., xN
- Initially N singleton sets: {x1}, {x2}, ..., {xN}
- Each set has a representative
- Maintain the partition under
  - Union(xi, xj): Joins the sets containing xi and xj
  - Find(xi): Returns the representative of the set containing xi
3 The Solution
[Figure: example sets with their representatives; Union(d, h) is shown with link-by-rank, Find(n) with path compression]
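The two techniques named on this slide, link-by-rank and path compression, can be sketched in a few lines. This is a minimal in-memory illustration, not code from the talk; the class name `UnionFind` and its method names are chosen here for the example.

```python
class UnionFind:
    """Disjoint sets with link-by-rank and path compression (in-memory sketch)."""

    def __init__(self, elements):
        # Every element starts as the representative of its own singleton set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the representative (the root).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: already in the same set
        # Link-by-rank: hang the root of smaller rank below the other root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind("abdefghjlmn")
uf.union("d", "h")
same_set = uf.find("d") == uf.find("h")
```

Every `find` may touch nodes scattered across memory, which is harmless in RAM but is what becomes expensive once the structure no longer fits in memory.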
4 Complexity
- O(N α(N)) for a sequence of N union and find operations [Tarjan 75]
  - α(·): Inverse Ackermann function (grows very slowly!)
- Optimal in the worst case [Tarjan 79, Fredman and Saks 89]
- Batched (off-line) version
  - Entire sequence known in advance
  - Can be improved to linear on RAM [Gabow and Tarjan 85]
  - Not possible on a pointer machine [Tarjan 79]
5 Simple and Good, as long as
- The entire data structure fits in memory
6 The I/O Model
Main memory of size M
One I/O transfers B items between memory and disk
Disk of infinite size
7 Our Results
- An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os expected
  - Same as sorting
  - Optimal in the worst case
- A practical algorithm using O(sort(N) log(N/M)) I/Os
- Applications to terrain analysis
  - Topological persistence: O(sort(N)) I/Os
  - Contour trees: O(sort(N)) I/Os
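To get a feel for the sort(N) bound, the sketch below evaluates (N/B) log_{M/B}(N/B) with constants dropped. The function name `sort_io` and the sample values of N, M and B are illustrative assumptions, not figures from the talk.

```python
import math


def sort_io(n, m, b):
    """Sorting bound (N/B) * log_{M/B}(N/B), ignoring constant factors.

    n: number of items, m: items that fit in memory, b: items per disk block.
    """
    blocks = n / b
    # At least one pass over the data is always needed.
    return blocks * max(1.0, math.log(blocks, m / b))


# Hypothetical machine: 2**25 items of memory, 2**13 items per block.
n, m, b = 5 * 10**8, 2**25, 2**13
bound = sort_io(n, m, b)
```

With these assumed parameters the bound is a small multiple of N/B, far below N, which is roughly what an approach costing about one I/O per operation would pay.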
8 I/O-Efficient Batched Union-Find
- Assumption: No redundant unions
  - Each union must join two different sets
  - Will remove later
- Two-stage algorithm
  - Convert to interval union-find
    - Compute an order on the elements s.t. each union joins two adjacent sets
  - Solve batched interval union-find
9 Union Graph
(Tree if no redundant unions)
1 Union(d, g)
2 Union(a, c)
3 Union(r, b)
4 Union(a, e)
5 Union(e, i)
6 Union(r, a)
7 Union(a, d)
8 Union(d, h)
9 Union(b, f)
[Figure: two equivalent union trees on the elements r, a, b, c, d, e, f, g, h, i; each edge is labeled with the time stamp of its union]
Equivalent union trees
10 Transforming the Union Tree
[Figure: the union tree transformed step by step]
Weights along root-to-leaf path decrease
11 Formulating as a Batched Problem
[Figure: the original union tree next to the transformed union tree]
For each edge, find the lowest ancestor edge with a higher weight
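The query stated on this slide can be illustrated with a naive in-memory walk up the tree. The tree below is the nine-union example rooted at r; the rooting, the `PARENT` encoding and the function name are assumptions made for this sketch, and the walk is not I/O-efficient.

```python
# child -> (parent, weight of the edge to the parent); weight = union time stamp.
# Assumed rooting of the example union tree at r.
PARENT = {
    "a": ("r", 6), "b": ("r", 3), "c": ("a", 2),
    "d": ("a", 7), "e": ("a", 4), "f": ("b", 9),
    "g": ("d", 1), "h": ("d", 8), "i": ("e", 5),
}


def lowest_heavier_ancestor_edge(node, parent=PARENT):
    """Return (upper, lower, weight) of the lowest ancestor edge whose weight
    exceeds that of the edge above `node`, or None if there is no such edge."""
    cur, own_weight = parent[node]
    while cur in parent:  # the root has no entry, so the walk stops there
        upper, weight = parent[cur]
        if weight > own_weight:
            return (upper, cur, weight)
        cur = upper
    return None


answers = {v: lowest_heavier_ancestor_edge(v) for v in PARENT}
```

For example, the edge above i (weight 5) skips the edge a-e (weight 4) and stops at r-a (weight 6), while the edge above d (weight 7) has no heavier ancestor edge. One plausible reading of the transformed tree is that each node is re-attached below the edge found this way, or to the root when there is none; the slides do not spell this out.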
12 Cast in a Geometry Setting
[Figure: the union tree and the segments obtained from its Euler tour]
Euler Tour
- x: positions in the tour
- y: weight
In O(sort(N)) I/Os [Chiang et al. 95]
13 Cast in a Geometry Setting
[Figure: the union tree and the segments obtained from its Euler tour]
For each edge, find the lowest ancestor edge with a higher weight
For each segment, find the shortest segment above and containing it
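A small sketch of the geometric view: each edge becomes a horizontal segment whose x-extent is the span of the Euler tour between walking down and back up the edge, and whose y-coordinate is the edge weight. The tree encoding, the function names and the quadratic scan are assumptions for illustration; the talk solves the same query with distribution sweeping instead.

```python
# parent -> list of (child, edge weight); assumed rooting of the example at r.
CHILDREN = {
    "r": [("a", 6), ("b", 3)],
    "a": [("c", 2), ("d", 7), ("e", 4)],
    "b": [("f", 9)],
    "d": [("g", 1), ("h", 8)],
    "e": [("i", 5)],
}


def edge_segments(root, children):
    """Map each edge (named by its lower endpoint) to (x_left, x_right, y)."""
    segs = {}
    clock = 0

    def visit(u):
        nonlocal clock
        for v, w in children.get(u, []):
            left = clock          # tour position when walking down the edge
            clock += 1
            visit(v)
            segs[v] = (left, clock, w)  # tour position when walking back up
            clock += 1

    visit(root)
    return segs


def shortest_segment_above(name, segs):
    """Naive scan: shortest segment strictly containing `name` with larger y."""
    left, right, y = segs[name]
    best = None
    for other, (ol, orr, oy) in segs.items():
        if ol < left and right < orr and oy > y:
            if best is None or orr - ol < segs[best][1] - segs[best][0]:
                best = other
    return best


SEGS = edge_segments("r", CHILDREN)
hits = {v: shortest_segment_above(v, SEGS) for v in SEGS}
```

Segments of ancestor edges are exactly the ones that contain a given segment, and a shorter containing segment belongs to a lower ancestor, so the answers agree with the tree formulation.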
14 Distribution Sweeping
M/B vertical slabs
checked recursively
Total cost O(sort(N))
checked here
15 In-Order Traversal
[Figure: the transformed union tree, with the resulting order b r a c e i g d h f shown below it]
Weights along root-to-leaf path decrease
- At u, with children u1, ..., uk (in increasing order of weight)
  - Recursively visit subtree at u1
  - Return u
  - For i = 2, ..., k: Recursively visit subtree at ui
Claim: this traversal produces the right order
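The traversal rule and the claim can be tried out on the running example. The transformed tree below is hardcoded as one reading of the figure (an assumption), and `unions_join_adjacent_sets` is a hypothetical checker written for this sketch.

```python
# Assumed transformed union tree: parent -> list of (child, edge weight).
TRANSFORMED = {
    "r": [("a", 6), ("b", 3), ("d", 7), ("h", 8), ("f", 9)],
    "a": [("c", 2), ("e", 4), ("i", 5)],
    "d": [("g", 1)],
}

# The nine unions of the example, in time-stamp order.
UNIONS = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]


def weighted_in_order(u, children, out):
    """Visit the lightest child's subtree, then u, then the remaining children."""
    kids = sorted(children.get(u, []), key=lambda vw: vw[1])
    if kids:
        weighted_in_order(kids[0][0], children, out)
    out.append(u)
    for v, _ in kids[1:]:
        weighted_in_order(v, children, out)
    return out


def unions_join_adjacent_sets(order, unions):
    """Replay the unions and check that each one joins two adjacent intervals."""
    pos = {x: i for i, x in enumerate(order)}
    interval = {x: (pos[x], pos[x]) for x in order}  # element -> its set's interval
    for x, y in unions:
        (lx, hx), (ly, hy) = interval[x], interval[y]
        if hx + 1 != ly and hy + 1 != lx:
            return False
        merged = (min(lx, ly), max(hx, hy))
        for z in order[merged[0]:merged[1] + 1]:
            interval[z] = merged
    return True


ORDER = weighted_in_order("r", TRANSFORMED, [])
ok = unions_join_adjacent_sets(ORDER, UNIONS)
```

More than one order can satisfy the adjacency property, so the checker is the meaningful test here rather than a comparison against one particular sequence.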
16 Solving Interval Union-Find
- Union: x = two operands, y = time stamp
- Find: x = operand, y = time stamp
[Figure: unions and finds plotted by position and time stamp, with the representative marked]
17 Solving Interval Union-Find
- Union: x = two operands, y = time stamp
- Find: x = operand, y = time stamp
Four instances of batched ray shooting: O(sort(N))
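A naive way to see why the interval version is geometric: each boundary between two neighboring positions disappears at the time stamp of the union that crosses it, and a find only has to look sideways for the nearest boundary still standing at its own time stamp. The sketch below takes the leftmost element of an interval as its representative, which is an assumption, and it scans linearly rather than using the batched ray shooting of the talk.

```python
def interval_union_find(order, ops):
    """Answer all finds of a batched interval union-find instance.

    order: elements left to right; ops: list of ("union", x, y) or ("find", x),
    where the list index serves as the time stamp.
    """
    pos = {x: i for i, x in enumerate(order)}
    never = float("inf")
    # removed_at[i]: time stamp at which the boundary between i and i+1 vanishes.
    removed_at = [never] * (len(order) - 1)
    for t, op in enumerate(ops):
        if op[0] == "union":
            lo, hi = sorted((pos[op[1]], pos[op[2]]))
            standing = [i for i in range(lo, hi) if removed_at[i] == never]
            assert len(standing) == 1, "union must join two adjacent sets"
            removed_at[standing[0]] = t
    answers = []
    for t, op in enumerate(ops):
        if op[0] == "find":
            i = pos[op[1]]
            # Move left across every boundary already removed before time t.
            while i > 0 and removed_at[i - 1] < t:
                i -= 1
            answers.append(order[i])
    return answers


OPS = [("union", "d", "g"), ("find", "d"), ("union", "a", "c"),
       ("union", "r", "b"), ("find", "c"), ("union", "a", "e"),
       ("find", "e"), ("find", "f")]
RESULT = interval_union_find(list("brcaeigdhf"), OPS)
```

Here the four finds return g, c, c and f: at each find's time stamp, that is the leftmost element of the interval containing the operand.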
19 Handling Redundant Unions
- Union tree becomes a general graph
- Compute the minimum spanning tree
  - O(sort(N)) I/Os (randomized) [Chiang et al. 95]
  - O(sort(N) log log B) I/Os (deterministic) [Arge et al. 04]
  - Deterministic O(sort(N)) I/Os if graph is planar
- Only MST edges are non-redundant
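Why the MST picks out the non-redundant unions can be seen with Kruskal's algorithm when edge weights are the union time stamps: edges are considered in time order, and an edge is rejected exactly when its endpoints are already connected. The sketch below is an in-memory stand-in, not one of the external-memory MST algorithms cited above; the function name and the sample sequence are made up for illustration.

```python
def non_redundant_unions(unions):
    """Return the 1-based time stamps of the unions that join two different sets,
    i.e. the MST edges when each union (x, y) is an edge weighted by its time stamp."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    # Kruskal: scan edges by increasing weight (here, by time stamp).
    for t, (x, y) in sorted(enumerate(unions, start=1)):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[rx] = ry
            kept.append(t)
    return kept


# Hypothetical sequence: the fourth union is redundant (c and g are already joined).
SAMPLE = [("d", "g"), ("a", "c"), ("a", "d"), ("c", "g"), ("g", "h")]
KEPT = non_redundant_unions(SAMPLE)
```

In this sample the kept time stamps are 1, 2, 3 and 5; dropping union 4 leaves a tree, so the two-stage algorithm applies to what remains.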
20 Applications
- Topological Persistence
- Contour Trees
21 Application: Topological Persistence
- Introduced by Edelsbrunner et al. 2000
- Measure importance on a surface
  - Feature extraction
  - Topological de-noising
- Many applications
  - Surface modeling
  - Shape analysis
  - Terrain analysis
  - Computational Biology
22 Topological Persistence Illustrated
23 Formulated as Batched Union-Find
- Represented as a triangulated mesh
- Consider minimum-saddle pairs
- When reaching
  - A minimum or maximum: do nothing
  - A regular point u: Issue union(u, v) for a lower neighbor v
  - A saddle u: let v and w be nodes from u's two connected pieces in its lower link. Issue find(v), find(w), union(u, v), union(u, w)
[Figure: lower link]
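The pairing idea can be sketched on a plain graph with heights. This is a simplified, on-line, in-memory illustration under stated assumptions: it uses graph neighbors instead of lower-link pieces of a triangulated mesh, it treats any vertex that merges two components as the "saddle", it keeps the lowest minimum as each set's representative, and it assumes distinct heights. The function name and the sample profile are hypothetical.

```python
def min_saddle_pairs(height, neighbors):
    """Sweep vertices by increasing height and pair each merged-away minimum
    with the vertex that merges it: returns (minimum, merge vertex, persistence)."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u in sorted(height, key=height.get):
        # Representatives of the components of u's already-swept (lower) neighbors.
        roots = sorted({find(v) for v in neighbors[u] if v in parent},
                       key=height.get)
        if not roots:
            parent[u] = u  # a minimum: starts a new component
            continue
        oldest = roots[0]
        parent[u] = oldest  # regular point: union(u, v) with a lower neighbor
        for younger in roots[1:]:
            # u merges two components: the younger minimum is paired with u.
            pairs.append((younger, u, height[u] - height[younger]))
            parent[younger] = oldest
    return pairs


# Hypothetical 1D height profile; vertex i neighbors i-1 and i+1.
PROFILE = [3, 1, 4, 0, 6, 2, 7]
HEIGHT = dict(enumerate(PROFILE))
NEIGHBORS = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(PROFILE)]
             for i in HEIGHT}
PAIRS = min_saddle_pairs(HEIGHT, NEIGHBORS)
```

On this profile the minimum at index 1 is paired with the merge at index 2 (persistence 3) and the minimum at index 5 with the merge at index 4 (persistence 4); the global minimum at index 3 is never paired.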
24 Experiment 1: Random Union-Find
128MB memory
25 Experiment 2: Topological Persistence on Terrain Data
Neuse River Basin of North Carolina: 0.5 billion points
26 Experiment 2: Topological Persistence on Terrain Data
128MB memory
Entire data set (0.5 billion points): IM fails and EM takes 10 hours
27 Contour Trees
28 Summary
- An I/O-efficient algorithm for the batched union-find problem using O(sort(N)) = O((N/B) log_{M/B}(N/B)) I/Os
  - Optimal in the worst case
- A practical algorithm using O(sort(N) log(N/M)) I/Os
- Applications to terrain analysis
  - Topological persistence: O(sort(N)) I/Os
  - Contour trees: O(sort(N)) I/Os
- Open Question
  - On-line case: Can we get below O(N α(N)) I/Os?
29 Thank you!
30 Previous Results
- Directly maintain contours
  - O(N log N) time [van Kreveld et al. 97]
  - Needs union-split-find for circular lists
  - Does not extend to higher dimensions
- Two sweeps by maintaining components, then merge
  - O(N log N) time [Carr et al. 03]
  - Extends to arbitrary dimensions
31 Join Tree and Split Tree
[Figure: join tree and split tree on nodes 1 to 9, with the qualified nodes marked]
32 Final Contour Tree
[Figure: join tree, split tree, and the resulting contour tree on nodes 1 to 9]
Hard to BATCH!
33 Another Characterization
Let w be the highest node that is a descendant of v in the join tree and an ancestor of u in the split tree; then (u, w) is a contour tree edge
[Figure: join tree, split tree, and contour tree on nodes 1 to 9, with u, v, and w marked]
Now can BATCH!
34 Map to Rectangles
[Figure: nodes u, v, and w of the join tree and split tree mapped to rectangles]
Can be solved in O(sort(N)) I/Os (practical, too)
35 Topological Persistence
36 Label Nodes with Intervals
[Figure: tree on nodes 1 to 9 labeled with intervals]
Using Euler tour (O(sort(N)) I/Os)