Title: Optimal Planar Point Enclosure Indexing
1 Optimal Planar Point Enclosure Indexing
- Lars Arge, Vasilis Samoladas and Ke Yi
- Department of Computer Science
- Duke University
- Technical University of Crete
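2 Two Dual Problems
- Range searching: the data are points, a query is a rectangle
- Point enclosure: the data are rectangles, a query is a point
Problem | Internal memory | External memory
Range searching | solved | solved
Point enclosure | solved | ?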
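To make the duality concrete, here is a minimal brute-force sketch of both query types. The function names and the tuple layouts, `(x, y)` for points and `(x1, x2, y1, y2)` for rectangles, are assumptions made for this illustration only. Neither function is an index; each simply scans all of the data.

```python
def range_search(points, rect):
    """Report every stored point that lies inside the query rectangle."""
    x1, x2, y1, y2 = rect
    return [p for p in points if x1 <= p[0] <= x2 and y1 <= p[1] <= y2]


def point_enclosure(rects, point):
    """Report every stored rectangle that contains the query point."""
    x, y = point
    return [r for r in rects if r[0] <= x <= r[1] and r[2] <= y <= r[3]]


points = [(1, 1), (3, 4), (5, 2)]
rects = [(0, 4, 0, 5), (2, 6, 1, 3), (4, 8, 4, 9)]

print(range_search(points, (0, 4, 0, 5)))   # points inside one rectangle
print(point_enclosure(rects, (3, 2)))       # rectangles around one point
```

The two functions test the same containment predicate; only the roles of data and query are exchanged.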
3 Outline
- Previous results in internal memory
- Computation models in external memory
- Previous results in external memory
- Our lower bound result
- Matching upper bound
- Conclusions
4 Previous Results: Internal Memory
- Computation model: pointer machine
- Range searching ($T$ is the output size)
  - $O(N)$ space, $O(N^{\epsilon} + T)$ time [BM 80]
  - $O(N \log N / \log\log N)$ space, $O(\log N + T)$ time [Chazelle 88]
    - Tight for $O(\log^{c} N + T)$ query structures [Chazelle 90]
    - Can do better on a RAM
  - Other tradeoffs
- Point enclosure [Chazelle 86]
  - $\Theta(N)$ space, $\Theta(\log N + T)$ time
  - Optimal in both space and time
5 External Memory Models
- External pointer machine
  - Natural generalization of the internal pointer machine
  - Each node contains $B$ data objects
  - Out-degree is $B$ instead of 2
- Bounding-volume hierarchy (non-replicating index structure)
  - Tree structure
  - Each object is stored only once
- Indexability model [HKP 97]
[Figure: external memory model with disk D, block I/O, main memory M, and processor P]
6 External Memory Models
- Indexability model
  - No structure at all!
  - Only models layout of data
  - Each block contains $B$ data objects
  - Can magically find the smallest set of blocks whose union contains all results
  - Cost is defined to be the size of that set
[Figure: the three models and the results associated with each]
- Indexability model: 1D range searching
- External pointer machine: all other known results
- Bounding-volume hierarchy: R-trees, kd-trees
7 Previous Results: External Memory
- Range searching ($n = N/B$)
- Similar to internal memory, tradeoff between space and time
- $O(\log_B n + T/B)$ query time
  - $O(n \log n / \log\log_B n)$ space [ASV 99]
  - Tight in external pointer machine [SR 95]
  - Improved to indexability model [ASV 99]
- $O(n)$ space
  - $O(\sqrt{n} + T/B)$ time [kdB-tree, GI 99, KS 99]
  - Tight in bounding-volume hierarchies
  - Can do $O(n^{\epsilon} + T/B)$ with constant redundancy
  - Tight in indexability model [ASV 99]
8 Previous Results: External Memory
- Point enclosure
  - O( ) for bounding-volume hierarchies [ABGHH 01]
  - Easy to get an $O(n)$ space, $O(\log_2 n + T/B)$ query structure
Problem | Internal memory (space, query) | External memory (space, query)
1D range | $(N, \log N + T)$ | $(n, \log_B n + T/B)$
1D point enclosure | $(N, \log N + T)$ | $(n, \log_B n + T/B)$
2D range | $(N, N^{\epsilon} + T)$ | $(n, n^{\epsilon} + T/B)$
2D point enclosure | $(N, \log N + T)$ | $(n, \log_2 n + T/B)$; $(nB^{\epsilon}, \log_B n + T/B)$; $(n, \log_B n + T/B)$?
9 Indexability Model in Detail
- $N$ data objects laid out in disk blocks, possibly with redundancy
  - Each block holds at most $B$ objects
- Cost of a query $q$: minimum number of blocks needed to retrieve all answers
  - Can find those blocks without cost
- Redundancy $r$ and access overhead $A$
  - $r$: average number of copies of an object in the index
    - Size is $rn$ blocks
  - $A$: ratio of the query cost to the ideal cost in the worst case
    - Any query $q$ can be covered by $A \lceil |q|/B \rceil$ blocks ($A \le B$)
- Lower bound expressed as a tradeoff between $r$ and $A$
  - 2D range searching [ASV 99]
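The definitions of $r$ and $A$ can be checked on a toy layout. The sketch below is an assumption-laden illustration rather than anything taken from the talk: the helper names, the eight-object instance, and the exhaustive cover search are all hypothetical, and the search is only feasible for a handful of blocks. It measures the redundancy and access overhead of two layouts for 1D range queries with $B = 2$.

```python
from itertools import combinations
from math import ceil


def min_cover(blocks, query):
    """Smallest number of blocks whose union contains the query (exhaustive)."""
    for k in range(len(blocks) + 1):
        for combo in combinations(blocks, k):
            if query <= set().union(*combo):
                return k
    raise ValueError("query cannot be covered by the layout")


def redundancy(blocks, N, B):
    """r = (number of blocks) / n, where n = N / B."""
    return len(blocks) * B / N


def access_overhead(blocks, queries, B):
    """A = worst-case ratio of the query cost to the ideal cost ceil(|q| / B)."""
    return max(min_cover(blocks, q) / ceil(len(q) / B) for q in queries)


N, B = 8, 2
# Workload: every interval [i, j] over the objects 0..7.
queries = [set(range(i, j + 1)) for i in range(N) for j in range(i, N)]

plain = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]       # every object stored once
shifted = plain + [{1, 2}, {3, 4}, {5, 6}]     # extra copies, offset by one

print(redundancy(plain, N, B), access_overhead(plain, queries, B))
print(redundancy(shifted, N, B), access_overhead(shifted, queries, B))
```

On this instance, adding the shifted copies raises the redundancy and lowers the access overhead, which is the kind of tradeoff the lower bounds quantify.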
10 Previous Results in the Indexability Model
- Set queries [HKP 97]
  - A set $S$ of $N$ objects, queries can be any subset of $S$
  - For any rn/B, AB
  - Trivial
- Range searching
  - [HKP 97]
  -
  - [SP 98]
    - Only tight for the special case when points form a grid
  - [ASV 99]
11 Redundancy Theorem [SP 98]
- (Asymptotic version)
- For $N$ data objects, if there exist $m$ queries $q_1, \dots, q_m$ such that for any $1 \le i, j \le m$, $i \ne j$,
  - $|q_i| \ge B$,
  - $|q_i \cap q_j| \le B/A^2$,
- then, we have the redundancy
- Combinatorial in nature
- Used successfully to obtain the range searching lower bound
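The two premises of the theorem are easy to test mechanically for a concrete query set. The checker below is a hypothetical helper written for this note: it treats each query as a Python set of object identifiers and uses the asymptotic thresholds exactly as stated above, ignoring any constant factors the precise theorem may carry.

```python
from itertools import combinations


def satisfies_premises(queries, B, A):
    """Check |q_i| >= B for all i and |q_i & q_j| <= B / A**2 for all i != j."""
    big_enough = all(len(q) >= B for q in queries)
    nearly_disjoint = all(
        len(q1 & q2) <= B / A**2 for q1, q2 in combinations(queries, 2)
    )
    return big_enough and nearly_disjoint


# Three queries of size 4 that pairwise share exactly one object.
queries = [{1, 2, 3, 4}, {4, 5, 6, 7}, {7, 8, 9, 1}]

print(satisfies_premises(queries, B=4, A=2))   # threshold B / A**2 = 1
print(satisfies_premises(queries, B=4, A=3))   # threshold B / A**2 < 1
```

A larger $A$ tightens the overlap condition, so a lower-bound construction has to keep the pairwise intersections small in order to certify a large access overhead.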
12 Point Enclosure Lower Bound Construction (1)
- Set of queries: the Fibonacci lattice (a low-discrepancy point set)
  - $m$ points in an $m \times m$ grid
  - Only property used: any rectangle with area $am$ contains between $\lfloor a/c_1 \rfloor$ and $\lceil a/c_2 \rceil$ points, for constants $c_1, c_2$
- Set of objects
  - Tiling rectangles of size $a t^i \times m/t^i$
  - $t = (m/a)^{1/B}$, $i = 1, \dots, B$
  - $m = aN/B$
  - $\Theta(B m^2/(am)) = \Theta(N)$ rectangles are constructed
  - $|q_i| \ge B$ is satisfied
-
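The construction can be instantiated on a small example. The sketch below makes several assumptions that are not in the talk: it uses the standard definition of the Fibonacci lattice, $(i,\ i F_{k-1} \bmod F_k)$ for $i = 0, \dots, F_k - 1$; it picks $m = 144$, $a = 9$, $B = 4$ so that $t = (m/a)^{1/B} = 2$ is an integer and every layer tiles the grid exactly; and all function names are hypothetical. Each query $q_i$ is represented as the set of rectangles that contain lattice point $i$.

```python
from itertools import combinations


def fibonacci_lattice(k):
    """Return m = F_k and the points (i, i * F_{k-1} mod F_k), with F_1 = F_2 = 1."""
    fib = [1, 1]
    while len(fib) < k:
        fib.append(fib[-1] + fib[-2])
    m = fib[k - 1]
    return m, [(i, (i * fib[k - 2]) % m) for i in range(m)]


def tiling_layers(m, a, t, B):
    """Layer i = 1..B uses tiles of width a * t**i and height m / t**i."""
    layers = []
    for i in range(1, B + 1):
        w, h = a * t**i, m // t**i
        # This toy instance is chosen so that the tiles fit the grid exactly.
        assert m % w == 0 and m % h == 0 and w * h == a * m
        layers.append((w, h))
    return layers


def enclosing(point, layers):
    """The tiles containing the point: one (layer, column, row) triple per layer."""
    x, y = point
    return {(i, x // w, y // h) for i, (w, h) in enumerate(layers, start=1)}


m, points = fibonacci_lattice(12)          # m = F_12 = 144
a, t, B = 9, 2, 4                          # t**B == m // a
layers = tiling_layers(m, a, t, B)
num_rects = sum((m // w) * (m // h) for w, h in layers)
queries = [enclosing(p, layers) for p in points]
max_overlap = max(len(q1 & q2) for q1, q2 in combinations(queries, 2))

print(num_rects, {len(q) for q in queries}, max_overlap)
```

Every point lies in exactly one tile per layer, so $|q_i| = B$, and the number of rectangles is $B \cdot m/a$, which equals $N$ when $m = aN/B$. The value `max_overlap` is the quantity that has to stay below $B/A^2$ for the Redundancy Theorem to apply.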
13 Point Enclosure Lower Bound Construction (2)
- Any $A$ that satisfies $|q_i \cap q_j| \le B/A^2$ will become a lower bound
  - Make $A$ as large as possible
- For a rectangle to cover $q_1$ and $q_2$, whose horizontal and vertical distances are $x$ and $y$, we must have $a t^i \ge x$ and $m/t^i \ge y$, or $x/a \le t^i \le m/y$
  - $q_1$ and $q_2$ are two points from the Fibonacci lattice, so $xy \ge c_2 m$
  - such rectangles
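One way to count the layers that can contain both points is worked out below. This is a derivation from the two constraints stated above, not necessarily the expression shown on the original slide. It assumes $t > 1$ (that is, $m > a$), that $x$ and $y$ denote the horizontal and vertical distances between $q_1$ and $q_2$, and that the lattice property is used in the form $xy \ge c_2 m$.

```latex
\begin{align*}
\frac{x}{a} \le t^{i} \le \frac{m}{y}
  \;&\Longleftrightarrow\;
  \log_t \frac{x}{a} \le i \le \log_t \frac{m}{y}, \\
\#\{\, i \,\}
  \;&\le\; \log_t \frac{m}{y} - \log_t \frac{x}{a} + 1
  \;=\; \log_t \frac{am}{xy} + 1
  \;\le\; \log_t \frac{a}{c_2} + 1 .
\end{align*}
```

Each layer is a tiling, so it contributes at most one rectangle containing both points, and the count of admissible layers bounds $|q_1 \cap q_2|$.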
14 Point Enclosure Lower Bound Construction (3)
- Disproves the earlier $(n, \log_B n + T/B)$ conjecture
- Still a square root factor away
- What's wrong? The construction technique, or the model itself?
15 Refine the Indexability Model
- $O(\log_B n + |q|/B)$
  - Search cost + retrieval cost
- Observation: retrieval cost is relatively high for small queries
- Refine: add an additive factor!
  - Old: any query $q$ is covered by $A \lceil |q|/B \rceil$ blocks
  - New: any query $q$ is covered by $A_0 + A_1 \lceil |q|/B \rceil$ blocks
- Modify the Redundancy Theorem accordingly
  - The two conditions $|q_i| \ge B$, $|q_i \cap q_j| \le B/A^2$ become $|q_i| \ge B A_0$, $|q_i \cap q_j| \le B/A_1^2$
16 The Refined Redundancy Theorem
- For $N$ data objects, if there exist $m$ queries $q_1, \dots, q_m$ such that for any $1 \le i, j \le m$, $i \ne j$,
  - $|q_i| \ge B A_0$,
  - $|q_i \cap q_j| \le B/(2A_1)^2$,
- then, we have the redundancy
- Proof sketch
  - Each query can be covered by at most $2A_1 \lceil |q_i|/B \rceil$ blocks,
  - and apply the original Redundancy Theorem with $A = 2A_1$
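The step behind the proof sketch can be spelled out as follows. This is a reconstruction under two assumptions: the refined model charges $A_0 + A_1 \lceil |q|/B \rceil$ blocks per query, and $A_1 \ge 1$.

```latex
\begin{align*}
|q_i| \ge B A_0
  \;&\Longrightarrow\; A_0 \le \left\lceil \frac{|q_i|}{B} \right\rceil
      \le A_1 \left\lceil \frac{|q_i|}{B} \right\rceil , \\
A_0 + A_1 \left\lceil \frac{|q_i|}{B} \right\rceil
  \;&\le\; 2 A_1 \left\lceil \frac{|q_i|}{B} \right\rceil .
\end{align*}
```

On queries this large, the refined index therefore behaves like an index with access overhead $A = 2A_1$ in the original model, which is why the overlap condition uses $(2A_1)^2$.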
17 Fix the Construction
- Old construction
  - $|q| \ge B$
  - $B$ layers of tiling rectangles
  - Size of Fibonacci lattice: $m = aN/B$
  - Total rectangles: $N$
- New construction
  - $|q| \ge B A_0$
  - $B A_0$ layers of tiling rectangles
  - Size of Fibonacci lattice: $m = aN/(B A_0)$
  - Total rectangles: $N$
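A quick check that the new parameters still produce $N$ rectangles, assuming (as in the old construction) that every layer tiles the $m \times m$ grid with rectangles of area $am$:

```latex
\begin{align*}
\underbrace{B A_0}_{\text{layers}} \cdot
\underbrace{\frac{m^2}{am}}_{\text{tiles per layer}}
  \;=\; B A_0 \cdot \frac{m}{a}
  \;=\; B A_0 \cdot \frac{1}{a} \cdot \frac{aN}{B A_0}
  \;=\; N .
\end{align*}
```

Each lattice point lies in one tile of every layer, so every query now has size $B A_0$, as the refined theorem requires.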
18 Range Searching vs. Point Enclosure
- Range searching
  - Original model
  - New model
- Point enclosure
- Dual bounds in external memory!
19 Matching Upper Bounds (1)
- In the external pointer machine model
- Only interested in the case $A_1 = O(1)$
- Goal: for any $r \le B$, design an index with redundancy $r$ that answers a query in $O(\log_r n + T/B)$ I/Os
- Building block: one-sided segment intersection queries
  - Given $N$ horizontal segments
  - Report all segments directly above a query point
- Persistent B-tree (modified)
  - $O(n)$ space, $O(\log_B n + T/B)$ query
  - Search on the x-coordinate of the query point
  - Retrieve the segments
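As a reference for what the building block must return, here is a brute-force version of the one-sided segment intersection query. It is only a specification-style sketch, not the persistent B-tree: the function name and the `(x1, x2, y)` segment layout are assumptions, and the scan costs linear time rather than $O(\log_B n + T/B)$ I/Os.

```python
def segments_above(segments, qx, qy):
    """Horizontal segments (x1, x2, y) hit by the upward vertical ray from (qx, qy)."""
    hits = [s for s in segments if s[0] <= qx <= s[1] and s[2] >= qy]
    return sorted(hits, key=lambda s: s[2])   # nearest segment first


segments = [(0, 4, 1), (2, 6, 3), (5, 9, 2), (1, 3, 5)]
print(segments_above(segments, 2.5, 2))
```

A segment is reported when its x-extent covers the query's x-coordinate and it lies at or above the query point, which matches the two steps listed above: locate by x-coordinate, then retrieve upward.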
20 Matching Upper Bounds (2)
- Divide the plane into $r$ horizontal slabs
- Associate two one-sided segment intersection structures with each slab
  - One for all top sides of rectangles that cross its bottom boundary
  - One for all bottom sides of rectangles that cross its top boundary and all bottom sides of rectangles that completely span the slab
- Recursively handle rectangles that fall completely within a slab, resulting in a tree with fanout $r$
- Any rectangle is stored at most $r$ times: redundancy is $r$
- Query: follow the tree top-down, asking two one-sided queries at each level
  - $O(\log_r n \cdot \log_B N + T/B)$ I/Os, improved to $O(\log_r n + T/B)$ by fractional cascading
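The slab decomposition can be sketched in memory as follows. This is an illustration under stated assumptions, not the authors' external-memory structure: the one-sided structures are replaced by brute-force list scans, there is no blocking or fractional cascading, slab boundaries are taken at quantiles of the y-coordinates, and the names `build`, `query`, `one_sided`, and `leaf_size` are hypothetical.

```python
import bisect
import random


def one_sided(segments, qx, qy, above):
    """Brute-force stand-in for a one-sided segment intersection structure."""
    return [rid for (x1, x2, y, rid) in segments
            if x1 <= qx <= x2 and (y >= qy if above else y <= qy)]


def build(rects, r, leaf_size=4):
    """rects: list of (x1, x2, y1, y2, rid) with x1 <= x2 and y1 <= y2."""
    if len(rects) <= leaf_size:
        return {"leaf": rects}
    ys = sorted(y for rc in rects for y in (rc[2], rc[3]))
    # Up to r - 1 slab boundaries at quantiles of the endpoint y-coordinates.
    bounds = sorted({ys[(k * len(ys)) // r] for k in range(1, r)})
    slabs = [{"tops": [], "bottoms": [], "inside": []}
             for _ in range(len(bounds) + 1)]
    for (x1, x2, y1, y2, rid) in rects:
        lo = bisect.bisect_right(bounds, y1)   # slab holding the bottom side
        hi = bisect.bisect_right(bounds, y2)   # slab holding the top side
        if lo == hi:                           # falls completely within one slab
            slabs[lo]["inside"].append((x1, x2, y1, y2, rid))
            continue
        slabs[lo]["bottoms"].append((x1, x2, y1, rid))  # crosses top boundary of lo
        slabs[hi]["tops"].append((x1, x2, y2, rid))     # crosses bottom boundary of hi
        for s in range(lo + 1, hi):                     # completely spans slab s
            slabs[s]["bottoms"].append((x1, x2, y1, rid))
    for slab in slabs:
        inside = slab.pop("inside")
        if len(inside) == len(rects):   # no progress, e.g. many equal coordinates
            slab["child"] = {"leaf": inside}
        else:
            slab["child"] = build(inside, r, leaf_size)
    return {"bounds": bounds, "slabs": slabs}


def query(node, qx, qy):
    """Report the ids of all rectangles containing (qx, qy)."""
    out = []
    while "leaf" not in node:
        slab = node["slabs"][bisect.bisect_right(node["bounds"], qy)]
        out += one_sided(slab["tops"], qx, qy, above=True)
        out += one_sided(slab["bottoms"], qx, qy, above=False)
        node = slab["child"]
    out += [rid for (x1, x2, y1, y2, rid) in node["leaf"]
            if x1 <= qx <= x2 and y1 <= qy <= y2]
    return out


random.seed(0)
rects = []
for rid in range(300):
    x1, x2 = sorted((random.randint(0, 60), random.randint(0, 60)))
    y1, y2 = sorted((random.randint(0, 60), random.randint(0, 60)))
    rects.append((x1, x2, y1, y2, rid))
tree = build(rects, r=4)
print(sorted(query(tree, 30, 30)))
```

A rectangle that crosses slab boundaries is stored once in every slab it touches at that level, which is where the factor $r$ in the redundancy comes from, while a query visits one slab per level and asks one upward and one downward one-sided query there.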
21 Conclusions
- A tight lower bound on the tradeoff between the redundancy and access overhead of any index for 2D point enclosure queries, given in the new indexability model
- A matching upper bound in the external pointer machine
- The END