Title: Computing Diameter in the Streaming and Sliding-Window Models
1Computing Diameter in the Streaming and
Sliding-Window Models
- J. Feigenbaum, S. Kannan, J. Zhang
2Introduction
- Two computational models
- Streaming model
- Sliding-window model
- The problem: the diameter of a point set $P$ in $\mathbb{R}^2$. The
diameter is the maximum pairwise distance between
points in $P$.
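As a baseline for the definition above, here is a minimal brute-force sketch. It is not an algorithm from the slides; the function name `diameter` is chosen for illustration, and it stores all points and checks all pairs, which is exactly what the streaming and sliding-window models try to avoid.

```python
import math
from itertools import combinations

def diameter(points):
    """Exact diameter: the maximum distance over all pairs of points.

    Takes O(|P|^2) time and needs the whole point set in memory.
    """
    return max((math.dist(p, q) for p, q in combinations(points, 2)), default=0.0)

P = [(0, 0), (3, 4), (1, 1), (-1, 0)]
print(diameter(P))  # realized by the pair (3, 4) and (-1, 0)
```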
3More about Models
- The streaming model
- A data stream is a sequence of data elements $a_1, a_2, \ldots, a_m$.
- A streaming algorithm is an algorithm that computes some function over a data stream and has the following properties:
- The input data are accessed in sequential order.
- The order of the data elements in the stream is not controlled by the algorithm.
- The length of the stream, $m$, is huge. Only space-efficient algorithms (sublinear or even polylog($m$)) are considered.
4Dynamic Algorithm in Computational Geometry
- Dynamic means that the set of objects under consideration may change. There could be additions to and deletions from the point set $P$.
- Maintain the current set of geometric objects in certain data structures. Efficient updating and query answering are emphasized.
- May use linear space, which differs from the requirement of the streaming and the sliding-window models.
5More about Models (Continued)
- The sliding-window model
- The input is still a stream of data elements.
- A data element arrives at each time instant; it later expires after a number of time steps equal to the window size $n$.
- The current window at any time instant is the set of data elements that have not yet expired.
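The arrival and expiry rule can be illustrated with a small sketch. The class name `SlidingWindow` and its methods are hypothetical; the sketch stores the whole window (linear space), so it only shows the model's semantics, not a space-efficient algorithm.

```python
from collections import deque

class SlidingWindow:
    """Keeps the elements that have not yet expired, for window size n."""

    def __init__(self, n):
        self.n = n
        self.time = 0
        self.items = deque()  # (arrival_time, element), oldest first

    def push(self, x):
        # One element arrives per time instant.
        self.time += 1
        self.items.append((self.time, x))
        # An element expires n time steps after its arrival.
        while self.items[0][0] <= self.time - self.n:
            self.items.popleft()

    def current(self):
        return [x for _, x in self.items]

w = SlidingWindow(3)
for x in [1, 2, 3, 4, 5]:
    w.push(x)
print(w.current())  # the last three arrivals
```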
6Computing Diameter in the Streaming Model
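- A well-known diameter approximation is streaming in nature.
- Project the points onto lines.
- Requires an angle $\theta$ such that, for points $p, q$ with projections $\pi(p), \pi(q)$,
- $|\pi(p)\pi(q)| \ge |pq| \cos\theta \ge (1 - \theta^2/2)\,|pq| \ge (1 - \varepsilon)\,|pq|$
- The algorithm goes through the input once. It needs storage for $O(1/\sqrt{\varepsilon})$ points. To process each point, it performs $O(1/\sqrt{\varepsilon})$ projections.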
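A minimal sketch of the projection idea, under stated assumptions: the lines pass through the origin at evenly spaced angles, $\theta = \sqrt{2\varepsilon}$ is chosen so that $1 - \theta^2/2 = 1 - \varepsilon$, and the class name `ProjectionDiameter` is hypothetical. Instead of storing extreme points, the sketch stores the minimum and maximum projection per line, which is enough to report the estimate.

```python
import math

class ProjectionDiameter:
    """One-pass diameter estimate via projections onto k evenly spaced lines."""

    def __init__(self, eps):
        theta = math.sqrt(2 * eps)  # then 1 - theta^2 / 2 == 1 - eps
        # Lines pi/k apart: every direction is within pi/(2k) <= theta of a line.
        k = max(1, math.ceil(math.pi / (2 * theta)))
        self.dirs = [(math.cos(i * math.pi / k), math.sin(i * math.pi / k))
                     for i in range(k)]
        self.lo = [math.inf] * k
        self.hi = [-math.inf] * k

    def add(self, p):
        # One projection per line for each arriving point.
        for i, (ux, uy) in enumerate(self.dirs):
            t = p[0] * ux + p[1] * uy
            self.lo[i] = min(self.lo[i], t)
            self.hi[i] = max(self.hi[i], t)

    def estimate(self):
        # Largest projected extent over all lines (call after at least one add).
        return max(h - l for h, l in zip(self.hi, self.lo))
```

Because a projection never lengthens a segment, the estimate is at most the true diameter, and by the inequality chain it is at least $(1-\varepsilon)$ times the true diameter.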
7Diameter Approximation in the Streaming Model
- Theorem 1: There is a streaming $\varepsilon$-approximation algorithm for diameter that needs storage for $O(1/\varepsilon)$ points and processes each point in $O(\log(1/\varepsilon))$ time.
- Take the first point of the stream as the center and divide the space into sectors of angle $\theta = \varepsilon / (2(1-\varepsilon))$.
- For each sector, keep the point furthest from the center in that sector.
8Diameter Approximation in the Streaming Model
- Let $H$ be the maximum distance between the center and any other point, and let $T_{ij}$ be the minimal distance between the boundary arcs of sector $i$ ($bb'$) and sector $j$ ($aa'$). Approximate the diameter with $\max\{H, \max_{i,j} T_{ij}\}$.
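The sector bookkeeping can be sketched as follows. This is a simplified illustration, not the estimator from the slides: it keeps the furthest point per sector as described, but at query time it reports the largest distance among the kept points and the center rather than computing the arc distances $T_{ij}$, so no approximation guarantee is claimed for it. The class name `SectorSketch` and the reading $\theta = \varepsilon/(2(1-\varepsilon))$ are assumptions.

```python
import math

class SectorSketch:
    """Keeps, per angular sector around the first point, the furthest point."""

    def __init__(self, eps):
        theta = eps / (2 * (1 - eps))            # assumed sector angle, 0 < eps < 1
        self.k = math.ceil(2 * math.pi / theta)  # sector width 2*pi/k <= theta
        self.center = None
        self.far = {}  # sector index -> (distance from center, point)

    def add(self, p):
        if self.center is None:
            self.center = p                      # first point becomes the center
            return
        dx, dy = p[0] - self.center[0], p[1] - self.center[1]
        r = math.hypot(dx, dy)
        if r == 0:
            return
        ang = math.atan2(dy, dx) % (2 * math.pi)
        s = min(int(ang / (2 * math.pi) * self.k), self.k - 1)
        if s not in self.far or r > self.far[s][0]:
            self.far[s] = (r, p)

    def kept(self):
        return [p for _, p in self.far.values()]

    def estimate(self):
        # Simplification: a lower bound on the diameter from the kept points only.
        pts = self.kept() + ([self.center] if self.center is not None else [])
        return max((math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]),
                   default=0.0)
```

The storage is at most one point per sector, that is, $O(1/\varepsilon)$ points, which matches the storage bound stated in Theorem 1.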
9Maintaining Diameter in the Sliding-Window Model
- Our space-efficient method maintains the diameter for sliding windows when the set of points $P$ can be bounded in a box that is not too large.
- Let $R$ be the maximum, over all windows, of the ratio of the diameter to the minimal non-zero distance between any two points in that window.
- "Not too large" means $R < 2^n$.
10Maintaining Diameter in the Sliding-Window Model
- Theorem 2: There is an $\varepsilon$-approximation algorithm that maintains the diameter for a planar point set in the sliding-window model using poly$(1/\varepsilon, \log n, \log R)$ bits of space.
11Remove Irrelevant Points
- Consider maintaining the diameter in 1-d.
- A point will never realize any diameter if it is spatially located between two newer points.
- Remove these points. Among the remaining points (where $a_1$ is newer than $a_2$, which is newer than $a_3$, ...), the newer points are located inside and the older points are located outside.
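The pruning rule in 1-d can be sketched in a few lines. The function name `prune_1d` is hypothetical, and the input is assumed to be listed from newest to oldest; a point that coincides with a newer point is also dropped, which is an assumption of this sketch.

```python
def prune_1d(points_newest_first):
    """Drop every point that lies between (or on) two newer points."""
    kept = []
    lo = hi = None  # range spanned by the newer points seen so far
    for x in points_newest_first:
        if lo is None:
            kept.append(x)
            lo = hi = x
        elif x < lo:
            kept.append(x)   # extends the range to the left
            lo = x
        elif x > hi:
            kept.append(x)   # extends the range to the right
            hi = x
        # otherwise lo <= x <= hi: x is covered by newer points and is dropped
    return kept

print(prune_1d([5, 3, 4, 8, 6, 1, 9]))  # [5, 3, 8, 1, 9]
```

In the output, each kept point lies outside the range of all newer kept points, which is the "newer inside, older outside" layout described above.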
12The Rounding Method
- Take the newest point as the center $c$, and round the other points down (toward the center).
- Divide the line into intervals with endpoints $t_i$ such that $|c\,t_i| = (1+\varepsilon')^i d$ for some distance $d$ (to be specified later).
- Round all points in the interval $[t_i, t_{i+1})$ down to $t_i$.
- In what follows we call the set of points after rounding a cluster. If $2^i$ original points are grouped into a cluster, we say the cluster is at level $i$.
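A minimal sketch of the rounding step in 1-d. The function name `round_down` and the parameter name `eps_p` (for the rounding parameter) are hypothetical, and collapsing points closer than `d` onto the center is an assumption of this sketch, since the slides do not say how such points are handled.

```python
import math

def round_down(points, c, d, eps_p):
    """Round each point toward the center c onto the grid at distance (1+eps_p)^i * d."""
    rounded = []
    for x in points:
        dist = abs(x - c)
        if dist < d:
            rounded.append(c)  # assumption: points closer than d collapse to c
            continue
        i = math.floor(math.log(dist / d) / math.log(1 + eps_p))
        # Guard against floating-point error at interval boundaries.
        while (1 + eps_p) ** (i + 1) * d <= dist:
            i += 1
        while (1 + eps_p) ** i * d > dist:
            i -= 1
        t = (1 + eps_p) ** i * d
        rounded.append(c + math.copysign(t, x - c))
    return rounded

print(round_down([0.5, 1, 3, -5, 8], c=0, d=1, eps_p=1.0))
```

Each rounded point moves toward the center by less than `eps_p` times its distance from the center, which is the displacement the later error analysis relies on.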
13Number of Points in a Cluster
- If multiple points are rounded to the same location, we can discard the older ones and keep only the newest one.
- Each interval then holds only one point. Let $D$ be the diameter; the number of points $k$ in a cluster is bounded by
- $k \le \log_{1+\varepsilon'}(D/d) = \frac{\log(D/d)}{\log(1+\varepsilon')} \le \frac{2}{\varepsilon'} \log(D/d)$
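The last step of this bound relies on a standard inequality for the logarithm. Assuming natural logarithms and a rounding parameter $\varepsilon'$ with $0 < \varepsilon' \le 1$ (the slides do not state the base or the range), it can be spelled out as:

```latex
\ln(1+x) \;\ge\; x - \frac{x^2}{2} \;\ge\; \frac{x}{2}
\quad \text{for } 0 \le x \le 1
\qquad\Longrightarrow\qquad
\frac{1}{\ln(1+\varepsilon')} \;\le\; \frac{2}{\varepsilon'} .
```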
14When Window Starts Sliding
- Need to consider addition and deletion.
- Deletion is easy, because the oldest point must be one of the cluster's extreme points.
- Addition is complicated, because we may need to update the cluster center for each point that arrives.
- Our solution: keep multiple clusters.
15Multiple Clusters in a Window
- We allow at most two clusters at each level.
- When the number of clusters at level $i$ exceeds 2, merge the two oldest clusters to form a cluster at level $i+1$.
- The window can thus be divided into clusters.
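The level rule behaves like a binary counter. The sketch below shows only that bookkeeping: merging is plain concatenation, with no pruning or rounding, and expiry of old points is not handled. The function name `add_point` and the `(level, points)` representation are assumptions made for illustration.

```python
def add_point(clusters, point):
    """clusters: list of (level, points), ordered from oldest to newest."""
    clusters.append((0, [point]))  # a new point starts as a level-0 cluster
    level = 0
    while True:
        idx = [i for i, (lv, _) in enumerate(clusters) if lv == level]
        if len(idx) <= 2:
            break
        # More than two clusters at this level: merge the two oldest ones.
        # They are adjacent because levels never increase from old to new.
        i, j = idx[0], idx[1]
        merged = (level + 1, clusters[i][1] + clusters[j][1])
        clusters[i:j + 1] = [merged]
        level += 1                 # the merge may cascade to the next level
    return clusters

clusters = []
for t in range(100):
    add_point(clusters, t)
print([(lv, len(pts)) for lv, pts in clusters])
```

With at most two clusters per level and $2^i$ original points in a level-$i$ cluster, a window of $n$ points is covered by $O(\log n)$ clusters.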
16Clusters in a Window
17Merge Clusters
- Cluster $c_1$ + cluster $c_2$ $\to$ cluster $c_3$
- Make $\mathrm{Ctr}_2$ the center of cluster $c_3$
18Merge Clusters (Continued)
- Discard the points in $c_1$ that are located between the centers of $c_1$ and $c_2$.
- If a point $p$ in $c_1$ satisfies $|p\,\mathrm{Ctr}_1| \le (1+\varepsilon')\,|\mathrm{Ctr}_1\mathrm{Ctr}_2|$, discard it, too.
19Merge Clusters (Continued)
- Round the points in $c_2$ and those remaining in $c_1$ after the previous two steps using the center $\mathrm{Ctr}_2$.
- The value of $d$ is lower bounded by $\varepsilon'\,|\mathrm{Ctr}_1\mathrm{Ctr}_2|$. The number of points in a cluster is then bounded by
- $(2/\varepsilon')\,(\log R + \log(1/\varepsilon'))$
20The Algorithm in 1-d
- Update: when a new point arrives,
- Check the age of the boundary points of the oldest cluster. If one of them has expired, remove it.
- Make the newly arrived point a cluster of size 1. Go through the clusters and merge clusters whenever necessary according to the rules stated above.
- While going through the clusters, update the boundary points of any cluster that changed.
- Update the window boundary points if necessary.
- Query answer: report the distance between the window boundary points as the window diameter.
21Space Requirement
- Let $\mathrm{diam}_p$ be a diameter realized by point $p$. Each time we do rounding, we introduce a displacement for $p$ of at most $\varepsilon'\,\mathrm{diam}_p$. Also, $p$ can be rounded at most $\log n$ times.
- Choose $\varepsilon'$ to be at most $\varepsilon/(2\log n)$ to bound the error.
- There are at most $2\log n$ clusters, and each cluster holds at most $O(\frac{1}{\varepsilon} \log n\,(\log R + \log\log n + \log\frac{1}{\varepsilon}))$ points. Keeping the age may require $\log n$ space for each point. The total space required is
- $O(\frac{1}{\varepsilon} \log^3 n\,(\log R + \log\log n + \log\frac{1}{\varepsilon}))$
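The arithmetic behind the choice of the rounding parameter can be checked numerically. This is a sketch under assumptions: logarithms are taken base 2, the function names are hypothetical, and `eps_p` stands for the rounding parameter. It checks that $\log n$ roundings displace a point by at most a fraction $\varepsilon/2$ of its diameter, and that substituting the rounding parameter into the per-cluster bound $(2/\varepsilon')(\log R + \log(1/\varepsilon'))$ gives the stated $O(\cdot)$ form.

```python
import math

def eps_prime(eps, n):
    """Rounding parameter eps' = eps / (2 log n)."""
    return eps / (2 * math.log2(n))

def points_per_cluster_bound(eps, n, R):
    """(2 / eps') * (log R + log(1 / eps')) with eps' as above."""
    ep = eps_prime(eps, n)
    return (2 / ep) * (math.log2(R) + math.log2(1 / ep))

def expanded_bound(eps, n, R):
    """Same quantity written in terms of eps, n and R."""
    log_n = math.log2(n)
    return (4 * log_n / eps) * (math.log2(R) + math.log2(log_n)
                                + math.log2(1 / eps) + 1)

eps, n, R = 0.1, 1024, 10 ** 6
# Total relative displacement after log n roundings is eps / 2.
print(math.log2(n) * eps_prime(eps, n))
print(points_per_cluster_bound(eps, n, R), expanded_bound(eps, n, R))
```

Multiplying the per-cluster bound by the $2\log n$ clusters and the $\log n$ cost of storing each age gives the $\log^3 n$ factor in the total.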
22Time Complexity
- Query answer time is O(1).
- Worst-case update time is $O(\frac{1}{\varepsilon} \log^2 n\,(\log R + \log\log n + \log\frac{1}{\varepsilon}))$ because we may have cascading merges.
- The amortized update time is $O(\log n)$.
23Extend the Algorithm to 2-d
- We will have a set of lines $l_0, l_1, \ldots$ and project the points in the plane onto the lines.
- Guarantee that any pair of points will be projected onto a line that makes an angle $\varphi$ with the segment joining them such that $1 - \cos\varphi \le \varepsilon/2$.
- Use the 1-d diameter-maintenance algorithm for each line.
- Everything will have a multiplicative overhead of $O(1/\sqrt{\varepsilon})$.
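One way to pick the lines is sketched below, assuming evenly spaced directions through the origin (the slides do not specify the construction). The function names are hypothetical. The sketch computes the largest admissible angle from the condition $1 - \cos\varphi \le \varepsilon/2$ and spaces the lines so that every direction is within that angle of some line.

```python
import math

def projection_lines(eps):
    """Angles (in radians) of evenly spaced lines covering all directions."""
    phi = math.acos(1 - eps / 2)        # largest phi with 1 - cos(phi) <= eps / 2
    k = math.ceil(math.pi / (2 * phi))  # spacing pi/k <= 2*phi between lines
    return [i * math.pi / k for i in range(k)]

def angle_to_nearest_line(direction, lines):
    """Smallest angle between an undirected direction and any of the lines."""
    best = math.pi
    for a in lines:
        d = abs(direction - a) % math.pi
        best = min(best, d, math.pi - d)
    return best

lines = projection_lines(0.1)
print(len(lines))
```

Since $1 - \cos\varphi \le \varphi^2/2$, the admissible angle is at least $\sqrt{\varepsilon}$, so the number of lines is $O(1/\sqrt{\varepsilon})$; each line then runs its own copy of the 1-d algorithm.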
24Lower Bound for Maintaining Exact Diameter
- Theorem 3: To maintain the exact diameter in the sliding-window model requires $\Omega(n)$ bits of space.
- Consider $2n$ points $a_1, a_2, \ldots, a_{2n}$ with the following properties:
- $a_{n+1}, a_{n+2}, \ldots, a_{2n}$ are located at coordinate zero.
- $|a_1 a_n| \ge |a_2 a_{n+1}| \ge |a_3 a_{n+2}| \ge \cdots \ge |a_{n-1} a_{2n-2}| \ge 1$
- The coordinates of the points $a_j$ for $j = 1, 2, \ldots, n-2$ have the form $nk$ for some $k = 1, 2, \ldots, n$.
25A Family of Point Sequences
We show below two sequences in the family.
[Figure: two example sequences, with points labeled $a_1, a_2, \ldots, a_{n-2}, a_{n-1}, a_n, a_{n+1}, a_{n+2}, \ldots$]
26Lower Bound for Maintaining Exact Diameter
(Continued)
- There are at least [formula missing] different sequences of $2n$ points satisfying the above properties.
- Need $\Omega(n)$ bits of space to distinguish them.
- (Note that here $R \le n^2 \ll 2^n$.)