Probabilistic Location and Routing - PowerPoint PPT Presentation

1
Probabilistic Location and Routing
  • INFOCOM 2002
  • Sean C. Rhea, John Kubiatowicz

2
Outline
  • Introduction
  • Algorithm Description
  • Experimental Setup
  • Results
  • Future Work
  • Related Work
  • Conclusion

3
Introduction
  • Two important challenges
  • How should we locate replicas?
  • How should we route queries to replicas?
  • Location-independent routing techniques
  • CAN, Chord, Pastry, and Tapestry
  • Location and routing operations require O(log N)
    overlay hops

4
Introduction (cont.)
  • As the replica approaches the location of the
    query source, the performance of the existing
    algorithms quickly diverges from optimality
  • Divergence
  • A small amount of mis-routing in the local
    area can lead to a large divergence from
    optimality, since the optimal path is short to
    begin with

5
Introduction (cont.)
  • Our probabilistic location and routing algorithm
    is based on attenuated Bloom filters
  • It is decentralized
  • It is locality aware
  • It follows a minimal search path
  • It uses constant storage per server
  • Attenuated Bloom filters allow us to achieve
  • Quickly finding nearby replicas when they exist
  • Finding every document even when replicas are
    scarce

6
Introduction (cont.)
7
Algorithm Description
  • Bloom Filters
  • bit-vector of length w
  • N different hash functions
  • False positives are possible; false negatives are
    not
  • The false-positive rate depends on the width, the
    number of hash functions, and the cardinality of
    the represented set
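The slides give no code; as a minimal Python sketch of a plain Bloom filter (the 256-bit width, four hash functions, and salted SHA-1 hashing are illustrative choices, not from the paper):

```python
import hashlib

class BloomFilter:
    """Bit-vector of width w probed by k hash functions."""

    def __init__(self, width=256, num_hashes=4):
        self.w = width
        self.k = num_hashes
        self.bits = [0] * width

    def _positions(self, key):
        # Derive k bit positions from salted SHA-1 digests of the key.
        for i in range(self.k):
            digest = hashlib.sha1(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.w

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        # All k bits set -> probable member.  False positives are
        # possible (other keys may have set the same bits); false
        # negatives are not.
        return all(self.bits[pos] for pos in self._positions(key))
```

The false-positive rate grows as the filter fills: a wider vector or fewer stored keys lowers it, at the cost of storage.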

8
Bloom filter
9
Attenuated Bloom filter
  • An attenuated Bloom filter of depth d is an
    array of d normal Bloom filters
  • We associate each neighbor link with an
    attenuated Bloom filter
  • The first filter in the array summarizes
    documents available from that neighbor
  • The ith Bloom filter is the bitwise OR of the
    Bloom filters of all nodes at distance i along
    any path starting with that neighbor link, where
    distance is measured in hops in the overlay
    network
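A sketch of the depth-d structure, again in illustrative Python (sets of bit indices stand in for real bit vectors, and the hashing scheme is an assumption):

```python
import hashlib

def bit_positions(key, k=4, w=256):
    """The k Bloom-filter bit positions for a key (illustrative hashing)."""
    return {int.from_bytes(hashlib.sha1(f"{i}:{key}".encode()).digest(),
                           "big") % w for i in range(k)}

class AttenuatedBloomFilter:
    """Array of d plain Bloom filters for one neighbor link:
    level i summarizes documents i+1 overlay hops away."""

    def __init__(self, depth=3):
        self.levels = [set() for _ in range(depth)]  # sets of set-bit indices

    def add_local(self, key):
        # A document stored at the neighbor itself: level 0 (one hop).
        self.levels[0] |= bit_positions(key)

    def absorb(self, neighbor_filter):
        # Merge (bitwise OR) a neighbor's filter, attenuated one level:
        # its level-i bits become our level-(i+1) bits.
        for i in range(1, len(self.levels)):
            self.levels[i] |= neighbor_filter.levels[i - 1]

    def first_matching_level(self, key):
        pos = bit_positions(key)
        for i, level in enumerate(self.levels):
            if pos <= level:   # all k bits set: probable match
                return i       # replica likely i+1 hops away
        return None
```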

10
Attenuated Bloom filter
11
The Query Algorithm
  • To perform a location query, the querying node
    examines the first level of each of its
    neighbors' attenuated Bloom filters
  • If one of the filters matches, the desired data
    item is likely only one hop away, and the query
    is forwarded to the matching neighbor closest to
    the current node in network latency
  • If no filter matches, the querying node looks for
    a match in the second level of every filter

12
The Query Algorithm (cont.)
  • As before, if a match is found, the query is
    forwarded to the matching neighbor of lowest
    latency
  • This time, however, it is not the immediate
    neighbor that is likely to possess the data item,
    but one of its neighbors
  • That next neighbor is determined as before, by
    examining the attenuated Bloom filters of the
    current server

13
The Query Algorithm (cont.)
  • On a false positive, the probabilistic search
    fails
  • The request can then be forwarded to the
    deterministic algorithm
  • Alternatively, the query can be returned to the
    previous server in the query path (depth-first
    search)
  • Each query in the system carries a list of all
    the servers that it has already visited
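The level-by-level scan above can be sketched as a next-hop selector (illustrative Python; each neighbor's attenuated filter is abstracted as a list of per-level key sets, so this ignores the false positives a real filter can produce):

```python
def choose_next_hop(key, neighbors, visited):
    """Pick the next hop for a location query.

    `neighbors` maps a server name to (latency_ms, levels), where
    levels[i] is the set of keys its level-(i+1) filter matches.
    Scan the levels shallow-to-deep; among neighbors matching at the
    shallowest level, prefer the lowest network latency; skip servers
    already on the query's visited list (enabling DFS backtracking).
    """
    depth = max(len(levels) for _, levels in neighbors.values())
    for i in range(depth):
        candidates = [(latency, name)
                      for name, (latency, levels) in neighbors.items()
                      if name not in visited
                      and i < len(levels) and key in levels[i]]
        if candidates:
            return min(candidates)[1]   # closest matching neighbor
    # No filter matches: backtrack, or hand the query off to the
    # deterministic algorithm.
    return None
```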

14
The Update Algorithm
  • Every server in the system stores both an
    attenuated Bloom filter for each outgoing link
    (e.g. F_AB in Fig. 3) and a copy of its
    neighbor's view of the reverse direction
  • The server calculates the changed bits in its own
    filter and in each of the filters its neighbors
    maintain
  • It then sends these changed bits out to each
    neighbor

15
The Update Algorithm (cont.)
  • On receiving such a message, each neighbor
    attenuates the bits one level and computes the
    changes it will make in each of its own
    neighbors' filters
  • These changes are sent out as well, and so on
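The propagation loop might be sketched as follows (illustrative Python, not the paper's wire protocol; `filters[(node, neighbor)]` stands for node's attenuated filter on the link toward neighbor, as a list of per-level sets of bit indices):

```python
def propagate_update(graph, filters, origin, changed_bits, depth):
    """Flood newly set bits outward from `origin`, attenuating one
    filter level per overlay hop; bits that would fall past the
    filter depth are dropped.  No duplicate suppression is done here,
    so on cyclic topologies some servers receive the update twice."""
    frontier = [(origin, None, 0)]            # (node, came_from, level)
    while frontier:
        node, came_from, level = frontier.pop()
        if level >= depth:
            continue                          # beyond the filters' horizon
        for nbr in graph[node]:
            if nbr == came_from:
                continue                      # don't echo straight back
            # nbr learns the new bits are level+1 hops away via node.
            filters[(nbr, node)][level] |= changed_bits
            frontier.append((nbr, node, level + 1))
```

The missing duplicate suppression is exactly the problem the destination- and source-filtering variants described next address.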

16
The Update Algorithm (cont.)
  • One problem with this algorithm
  • The update will be propagated to some servers
    more than once (in Fig. 3, a document was added
    to node D), which wastes bandwidth and inflates
    the false positive rate
  • Two distinct update filtering algorithms address
    this
  • Destination filtering
  • Source filtering
  • When a deletion causes bits at any level of a
    Bloom filter to change from one to zero, we must
    be careful to propagate this deletion to all
    appropriate nodes

17
Experimental Setup
  • We simulated the probabilistic algorithm in
    conjunction with two different deterministic
    algorithms
  • Home-node location
  • A home-node server keeps a set of pointers to
    every replica of the document
  • Tapestry
  • Assumes that every server and document in the
    system can be named with a unique,
    location-independent identifier

18
Tapestry (cont.)
  • Node-IDs for the node names and globally unique
    identifiers (GUIDs) for the documents
  • Two major components
  • A routing mesh
  • A distributed directory service
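The routing mesh forwards a message by incrementally matching digits of the destination identifier; a one-step sketch in illustrative Python (real Tapestry indexes its routing table by digit position and value rather than scanning a neighbor list):

```python
def shared_suffix_len(a, b):
    """Number of trailing digits two equal-length IDs share."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(node_id, dest_id, neighbor_ids):
    """One Tapestry-style routing step: forward to any neighbor that
    matches the destination in at least one more trailing digit than
    the current node does.  With a well-formed routing table this
    resolves one digit per hop, giving O(log N) hops overall."""
    matched = shared_suffix_len(node_id, dest_id)
    for nbr in neighbor_ids:
        if shared_suffix_len(nbr, dest_id) > matched:
            return nbr
    return None  # no longer match exists: this node is the root
```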

19
Tapestry Routing Mesh
20
Publication in Tapestry
21
Location in Tapestry
22
Simulation Environment
23
Simulation Environment (cont.)
  • All stub-to-stub edges are 100 Mb/s
  • All stub-to-transit edges are 1.5 Mb/s
  • All transit-to-transit edges are 45 Mb/s
  • In our experiments, we focus on stub-to-transit
    bandwidth, since these inter-domain edges are the
    most bandwidth-constrained in the system

24
Experiment Descriptions
  • Static experiments
  • Dynamic experiments
  • Based on whether the set of replicas in the
    system changes during the test

25
Results: probabilistic update algorithm
26
Static Experiments
27
Static Experiments (cont.)
28
Static Experiments (cont.)
29
Static Experiments (cont.)
30
Static Experiments (cont.)
31
Static Experiments (cont.)
32
Dynamic Experiments
33
Dynamic Experiments (cont.)
34
Dynamic Experiments (cont.)
35
Future Work
  • The design of algorithms that adhere to such
    restrictions while producing an overlay network
    in a self-organizing manner is thus an important
    component of our future work
  • Since an update to a cache causes Tapestry to
    send only O(log N) messages, whereas the
    probabilistic algorithm must send some amount of
    information to every server in its filters'
    range, using these more advanced algorithms
    should only improve the bandwidth consumption of
    the probabilistic algorithm relative to Tapestry

36
Related Work
  • Bloom filters have long been used as a lossy
    summary technique; this work is the first to
    combine them into a compound, topology-aware data
    structure
  • In [20], Bloom filters were used to improve the
    efficiency of distributed join operations by
    filtering elements without consuming network
    bandwidth
  • In [21], Aoki used Bloom filters to guide
    searches through generalized search trees

37
Related Work (cont.)
  • Both the Summary Cache and Cache Digests use
    Bloom filters to summarize the contents of a set
    of cooperating web caches
  • The Secure Discovery Service (SDS) uses Bloom
    filters to route queries to appropriate services,
    such as printers or scanners

38
Conclusion
  • The algorithm is based on a new data structure we
    call an attenuated Bloom filter
  • Furthermore, we have shown that our algorithm may
    be combined with a deterministic algorithm
  • Finally, we have demonstrated that ...