Probabilistic Location and Routing - PowerPoint PPT Presentation

1
Probabilistic Location and Routing
  • INFOCOM 2002
  • Sean C. Rhea, John Kubiatowicz

2
Outline
  • Introduction
  • Algorithm Description
  • Experimental Setup
  • Results
  • Future Work
  • Related Work
  • Conclusion

3
Introduction
  • Two important challenges
  • How should we locate replicas?
  • How should we route queries to replicas?
  • Location-independent routing techniques
  • CAN, Chord, Pastry, and Tapestry
  • Location and routing operations require O(log N)
    overlay hops

4
Introduction (cont.)
  • As the replica approaches the location of the
    query source, the performance of the existing
    algorithms quickly diverges from optimality
  • Divergence
  • A small amount of mis-routing in the local
    area can lead to a large divergence from
    optimality, since the optimal path is short to
    begin with

5
Introduction (cont.)
  • Our probabilistic location and routing algorithm
    is based on attenuated Bloom filters
  • It is decentralized
  • It is locality aware
  • It follows a minimal search path
  • It uses constant storage per server
  • Attenuated Bloom filters allow us to achieve
  • Quickly finding nearby replicas when they exist
  • Finding every document even when replicas are
    scarce

6
Introduction (cont.)
7
Algorithm Description
  • Bloom Filters
  • bit-vector of length w
  • N different hash functions
  • False positives are possible; false negatives are
    not
  • The false-positive rate depends on the width, the
    number of hash functions, and the cardinality of
    the represented set
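The slides give no code; as a minimal Python sketch of a plain Bloom filter (the 256-bit width, four hash functions, and salted SHA-1 hashing are illustrative choices, not from the paper):

```python
import hashlib

class BloomFilter:
    """Bit-vector of width w probed by k hash functions."""

    def __init__(self, width=256, num_hashes=4):
        self.w = width
        self.k = num_hashes
        self.bits = [0] * width

    def _positions(self, key):
        # Derive k bit positions from salted SHA-1 digests of the key.
        for i in range(self.k):
            digest = hashlib.sha1(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.w

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        # All k bits set -> probable member.  False positives are
        # possible (other keys may have set the same bits); false
        # negatives are not.
        return all(self.bits[pos] for pos in self._positions(key))
```

The false-positive rate grows as the filter fills: a wider vector or fewer stored keys lowers it, at the cost of storage.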

8
Bloom filter
9
Attenuated Bloom filter
  • An attenuated Bloom filter of depth d is an
    array of d normal Bloom filters
  • We associate each neighbor link with an
    attenuated Bloom filter
  • The first filter in the array summarizes
    documents available from that neighbor
  • The ith Bloom filter is the bitwise OR of the
    Bloom filters of all nodes at distance i along
    any path starting with that neighbor link, where
    distance is measured in hops in the overlay
    network
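A sketch of the depth-d structure, again in illustrative Python (sets of bit indices stand in for real bit vectors, and the hashing scheme is an assumption):

```python
import hashlib

def bit_positions(key, k=4, w=256):
    """The k Bloom-filter bit positions for a key (illustrative hashing)."""
    return {int.from_bytes(hashlib.sha1(f"{i}:{key}".encode()).digest(),
                           "big") % w for i in range(k)}

class AttenuatedBloomFilter:
    """Array of d plain Bloom filters for one neighbor link:
    level i summarizes documents i+1 overlay hops away."""

    def __init__(self, depth=3):
        self.levels = [set() for _ in range(depth)]  # sets of set-bit indices

    def add_local(self, key):
        # A document stored at the neighbor itself: level 0 (one hop).
        self.levels[0] |= bit_positions(key)

    def absorb(self, neighbor_filter):
        # Merge (bitwise OR) a neighbor's filter, attenuated one level:
        # its level-i bits become our level-(i+1) bits.
        for i in range(1, len(self.levels)):
            self.levels[i] |= neighbor_filter.levels[i - 1]

    def first_matching_level(self, key):
        pos = bit_positions(key)
        for i, level in enumerate(self.levels):
            if pos <= level:   # all k bits set: probable match
                return i       # replica likely i+1 hops away
        return None
```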

10
Attenuated Bloom filter
11
The Query Algorithm
  • To perform a location query, the querying node
    examines the first level of each of its
    neighbors' attenuated Bloom filters
  • If one of the filters matches, the desired data
    item is likely only one hop away, and the query
    is forwarded to the matching neighbor closest to
    the current node in network latency
  • If no filter matches, the querying node looks for
    a match in the second level of every filter

12
The Query Algorithm (cont.)
  • As before, if a match is found, the query is
    forwarded to the matching neighbor of lowest
    latency
  • This time, however, it is not the immediate
    neighbor that is likely to possess the data item,
    but one of its neighbors
  • That next neighbor is determined as before, by
    examining the attenuated Bloom filters of the
    current server

13
The Query Algorithm (cont.)
  • On a false positive, the probabilistic search
    fails
  • The request can then be forwarded to the
    deterministic algorithm
  • Alternatively, the query can be returned to the
    previous server in the query path (depth-first
    search)
  • Each query in the system carries a list of all
    the servers that it has already visited
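The level-by-level scan above can be sketched as a next-hop selector (illustrative Python; each neighbor's attenuated filter is abstracted as a list of per-level key sets, so this ignores the false positives a real filter can produce):

```python
def choose_next_hop(key, neighbors, visited):
    """Pick the next hop for a location query.

    `neighbors` maps a server name to (latency_ms, levels), where
    levels[i] is the set of keys its level-(i+1) filter matches.
    Scan the levels shallow-to-deep; among neighbors matching at the
    shallowest level, prefer the lowest network latency; skip servers
    already on the query's visited list (enabling DFS backtracking).
    """
    depth = max(len(levels) for _, levels in neighbors.values())
    for i in range(depth):
        candidates = [(latency, name)
                      for name, (latency, levels) in neighbors.items()
                      if name not in visited
                      and i < len(levels) and key in levels[i]]
        if candidates:
            return min(candidates)[1]   # closest matching neighbor
    # No filter matches: backtrack, or hand the query off to the
    # deterministic algorithm.
    return None
```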

14
The Update Algorithm
  • Every server in the system stores both an
    attenuated Bloom filter for each outgoing link
    (e.g. F_AB in Fig. 3) and a copy of its
    neighbor's view of the reverse direction
  • The server calculates the changed bits in its own
    filter and in each of the filters its neighbors
    maintain
  • It then sends these changed bits out to each
    neighbor

15
The Update Algorithm (cont.)
  • On receiving such a message, each neighbor
    attenuates the bits one level and computes the
    changes it will make in each of its own
    neighbors' filters
  • These changes are sent out as well, and so on
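The propagation loop might be sketched as follows (illustrative Python, not the paper's wire protocol; `filters[(node, neighbor)]` stands for node's attenuated filter on the link toward neighbor, as a list of per-level sets of bit indices):

```python
def propagate_update(graph, filters, origin, changed_bits, depth):
    """Flood newly set bits outward from `origin`, attenuating one
    filter level per overlay hop; bits that would fall past the
    filter depth are dropped.  No duplicate suppression is done here,
    so on cyclic topologies some servers receive the update twice."""
    frontier = [(origin, None, 0)]            # (node, came_from, level)
    while frontier:
        node, came_from, level = frontier.pop()
        if level >= depth:
            continue                          # beyond the filters' horizon
        for nbr in graph[node]:
            if nbr == came_from:
                continue                      # don't echo straight back
            # nbr learns the new bits are level+1 hops away via node.
            filters[(nbr, node)][level] |= changed_bits
            frontier.append((nbr, node, level + 1))
```

The missing duplicate suppression is exactly the problem the destination- and source-filtering variants described next address.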

16
The Update Algorithm (cont.)
  • One problem with this algorithm
  • The update will be propagated to some servers
    more than once (in Fig. 3, a document was added
    to node D), which wastes bandwidth and inflates
    the false positive rate
  • Two distinct update filtering algorithms address
    this
  • Destination filtering
  • Source filtering
  • When a deletion causes bits at any level of a
    Bloom filter to change from one to zero, we must
    be careful to propagate this deletion to all
    appropriate nodes

17
Experimental Setup
  • We simulated the probabilistic algorithm in
    conjunction with two different deterministic
    algorithms
  • Home-node location
  • A home-node server keeps a set of pointers to
    every replica of the document
  • Tapestry
  • Assumes that every server and document in the
    system can be named with a unique,
    location-independent identifier

18
Tapestry (cont.)
  • Node-IDs for the node names and globally unique
    identifiers (GUIDs) for the documents
  • Two major components
  • A routing mesh
  • A distributed directory service
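The routing mesh forwards a message by incrementally matching digits of the destination identifier; a one-step sketch in illustrative Python (real Tapestry indexes its routing table by digit position and value rather than scanning a neighbor list):

```python
def shared_suffix_len(a, b):
    """Number of trailing digits two equal-length IDs share."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(node_id, dest_id, neighbor_ids):
    """One Tapestry-style routing step: forward to any neighbor that
    matches the destination in at least one more trailing digit than
    the current node does.  With a well-formed routing table this
    resolves one digit per hop, giving O(log N) hops overall."""
    matched = shared_suffix_len(node_id, dest_id)
    for nbr in neighbor_ids:
        if shared_suffix_len(nbr, dest_id) > matched:
            return nbr
    return None  # no longer match exists: this node is the root
```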

19
Tapestry Routing Mesh
20
Publication in Tapestry
21
Location in Tapestry
22
Simulation Environment
23
Simulation Environment (cont.)
  • All stub-to-stub edges are 100 Mb/s
  • All stub-to-transit edges are 1.5 Mb/s
  • All transit-to-transit edges are 45 Mb/s
  • In our experiments, we focus on stub-to-transit
    bandwidth, since these inter-domain edges are the
    most bandwidth-constrained in the system

24
Experiment Descriptions
  • Static experiments
  • Dynamic experiments
  • Based on whether the set of replicas in the
    system changes during the test

25
Results: probabilistic update algorithm
26
Static Experiments
27
Static Experiments (cont.)
28
Static Experiments (cont.)
29
Static Experiments (cont.)
30
Static Experiments (cont.)
31
Static Experiments (cont.)
32
Dynamic Experiments
33
Dynamic Experiments (cont.)
34
Dynamic Experiments (cont.)
35
Future Work
  • The design of algorithms that adhere to such
    restrictions while producing an overlay network
    in a self-organizing manner is thus an important
    component of our future work
  • Since an update to a cache causes Tapestry to
    send only O(log N) messages, whereas the
    probabilistic algorithm must send some amount of
    information to every server in its filters'
    range, using these more advanced algorithms
    should only improve the bandwidth consumption of
    the probabilistic algorithm relative to Tapestry

36
Related Work
  • Bloom filters have long been used as a lossy
    summary technique; this work is the first to
    combine them into a compound, topology-aware data
    structure
  • In [20], Bloom filters were used to improve the
    efficiency of distributed join operations by
    filtering elements without consuming network
    bandwidth
  • In [21], Aoki used Bloom filters to guide
    searches through generalized search trees

37
Related Work (cont.)
  • Both the Summary Cache and Cache Digests use
    Bloom filters to summarize the contents of a set
    of cooperating web caches
  • The Secure Discovery Service (SDS) uses Bloom
    filters to route queries to appropriate services,
    such as printers or scanners

38
Conclusion
  • The algorithm is based on a new data structure we
    call an attenuated Bloom filter
  • Furthermore, we have shown that our algorithm may
    be combined with a deterministic algorithm
  • Finally, we have demonstrated that ...