A Case Study in Building Layered DHT Applications

Slides: 18
Provided by: yati7

Transcript and Presenter's Notes

1
A Case Study in Building Layered DHT Applications
  • Yatin Chawathe
  • Sriram Ramabhadran, Sylvia Ratnasamy, Anthony
    LaMarca, Scott Shenker, Joseph Hellerstein

2
Building distributed applications
  • Distributed systems are designed to be scalable,
    available and robust
  • What about simplicity of implementation and
    deployment?
  • DHTs proposed as simplifying building block
  • Simple hash-table API: put, get, remove
  • Scalable content-based routing, fault tolerance
    and replication
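The put/get/remove interface named above can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions: `ToyDHT` is a hypothetical name, a single Python dict plays the role of every node, several values may share one key, and SHA-1 is chosen only to show that a DHT locates data by a hash of the key. It is not the API of any particular DHT service.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """In-memory stand-in for a DHT (hypothetical; one dict plays all nodes)."""

    def __init__(self):
        self._store = defaultdict(list)

    @staticmethod
    def _hash(key):
        # Real DHTs route on a hash of the key; SHA-1 is an illustrative choice.
        return hashlib.sha1(key.encode("utf-8")).hexdigest()

    def put(self, key, value):
        """Store value under key; several values may share one key."""
        values = self._store[self._hash(key)]
        if value not in values:
            values.append(value)

    def get(self, key):
        """Return a copy of all values stored under key (empty list if none)."""
        return list(self._store.get(self._hash(key), []))

    def remove(self, key, value):
        """Remove one value stored under key, if present."""
        h = self._hash(key)
        values = self._store.get(h, [])
        if value in values:
            values.remove(value)
        if not values:
            self._store.pop(h, None)


dht = ToyDHT()
dht.put("00:11:22:33:44:55", "47.66,-122.31")  # made-up AP MAC and location
print(dht.get("00:11:22:33:44:55"))
dht.remove("00:11:22:33:44:55", "47.66,-122.31")
print(dht.get("00:11:22:33:44:55"))
```

Everything else in the talk (the index, range queries, updates) is expressed through these three calls only.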

3
Can DHTs help?
  • Can we layer complex functionality on top of
    unmodified DHTs?
  • Can we outsource the entire DHT operation to a
    third-party DHT service, e.g., OpenDHT?
  • Existing DHT applications fall into two classes
  • Simple unmodified DHT for rendezvous or storage,
    e.g., i3, CFS, FOOD
  • Complex apps that modify the DHT for enhanced
    functionality, e.g., Mercury, CoralCDN

4
Outline
  • Motivation
  • A case study: Place Lab
  • Range queries with Prefix Hash Trees
  • Evaluation
  • Conclusion

5
A Case Study: Place Lab
  • Positioning service for location-enhanced apps
  • Clients locate themselves by listening for known
    radio beacons (e.g. WiFi APs)
  • Database of APs and their known locations

Place Lab service: computes maps of AP MAC address → lat, lon
  lat, lon → list of APs ...
  AP → lat, lon ...

6
Why Place Lab?
  • Developed by group of ubicomp researchers
  • Not experts in system design and management
  • Centralized deployment since March 2004
  • Software downloaded by over 6000 sites
  • Concerns over organizational control →
    decentralize the service
  • But, want to avoid implementation and deployment
    overhead of distributed service

7
How DHTs can help Place Lab
[Figure: war-drivers submit neighborhood logs; Place Lab servers compute AP locations; the DHT provides storage and routing; clients download local WiFi maps]
  • Automatic content-based routing
  • Route logs by AP MAC address to appropriate Place
    Lab server
  • Robustness and availability
  • DHT managed entirely by third party
  • Provides automatic replication and failure
    recovery of database content

8
Downloading WiFi Maps
[Figure: war-drivers submit neighborhood logs; Place Lab servers compute AP locations; the DHT provides storage and routing; clients download local WiFi maps]
  • Clients perform geographic range queries
  • Download segments of the database, e.g., all
    access points in Philadelphia
  • Can we perform this entirely on top of an unmodified
    third-party DHT?
  • DHTs provide exact-match queries, not range
    queries

9
Supporting range queries
  • Prefix Hash Trees
  • Index built entirely with put, get,
    remove primitives
  • No changes to DHT topology or routing
  • Binary tree structure
  • Node label is a binary prefix of values stored
    under it
  • Nodes split when they get too big
  • Stored in DHT with node label as key
  • Allows for direct access to interior and leaf
    nodes
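The structure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: a plain dict stands in for the DHT, and `BLOCK_SIZE`, `put_node`, `get_node`, `find_leaf`, and `insert` are hypothetical names. Each tree node is stored under its label (the root is "R", its children "R0" and "R1", and so on), and a leaf splits once it holds more than `BLOCK_SIZE` keys. The leaf search here is a simple top-down scan, and concurrent updates are ignored.

```python
BITS = 4        # key width, matching the 4-bit keys in the slide's example
BLOCK_SIZE = 2  # hypothetical leaf capacity before a split

dht = {}  # stand-in for the DHT: node label -> node record


def put_node(label, node):
    dht["R" + label] = node


def get_node(label):
    return dht.get("R" + label)


def to_bits(k):
    return format(k, "0{}b".format(BITS))


def find_leaf(bits):
    """Scan prefixes from the root down until a leaf is found."""
    for length in range(BITS + 1):
        node = get_node(bits[:length])
        if node is not None and node["leaf"]:
            return bits[:length], node
    raise KeyError("no leaf covers " + bits)


def insert(k):
    bits = to_bits(k)
    label, node = find_leaf(bits)
    if bits in node["keys"]:
        return
    node["keys"].append(bits)
    put_node(label, node)  # write the updated leaf back
    # Split while the leaf is over capacity and can still be subdivided.
    while len(node["keys"]) > BLOCK_SIZE and len(label) < BITS:
        pos = len(label)  # the next bit decides which child gets a key
        left = [b for b in node["keys"] if b[pos] == "0"]
        right = [b for b in node["keys"] if b[pos] == "1"]
        put_node(label + "0", {"leaf": True, "keys": left})
        put_node(label + "1", {"leaf": True, "keys": right})
        put_node(label, {"leaf": False, "keys": []})  # now an interior node
        # Only the child that received the new key can still be too full.
        label = bits[:pos + 1]
        node = get_node(label)


put_node("", {"leaf": True, "keys": []})  # empty tree: the root is the only leaf
for k in [0, 3, 8, 6, 12, 14, 4, 5, 13, 15]:
    insert(k)
print(sorted(dht))
```

With a capacity of 2, inserting the ten keys shown in the slide's figure produces the same eleven node labels as the figure (R through R111). Because every node is a separate DHT entry keyed by its label, any interior or leaf node can be fetched with a single get.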

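[Figure: example PHT over 4-bit keys. Interior nodes: R, R0, R1, R01, R11. Leaves and their keys: R00: 0 (0000), 3 (0011); R010: 4 (0100), 5 (0101); R011: 6 (0110); R10: 8 (1000); R110: 12 (1100), 13 (1101); R111: 14 (1110), 15 (1111)]
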
10
PHT operations
  • Lookup(K)
  • Find leaf node whose label is prefix of K
  • Binary search across K's bits
  • O(log log D), where D = size of key space
  • Insert(K, V)
  • Lookup leaf node for K
  • If full, split node into two
  • Put value V into leaf node
  • Query(K1, K2)
  • Lookup node for P, where P = longest common prefix
    of K1, K2
  • Traverse subtree rooted at node for P
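The lookup and query operations above can be sketched as follows. This is a hedged illustration rather than the authors' code: the tree is the 4-bit example from the slides written out by hand, a dict stands in for the DHT, node labels omit the leading "R" (so "" is the root), and `dht_get`, `lookup`, and `range_query` are hypothetical names. The binary search relies on one property: along a key's prefixes, shorter ones are interior nodes, exactly one is a leaf, and longer ones do not exist.

```python
BITS = 4  # key width used in the slide's example

# Example tree from the slides, keyed by node label ("" stands for the root R).
INTERNAL = ["", "0", "1", "01", "11"]
LEAVES = {
    "00": ["0000", "0011"], "010": ["0100", "0101"], "011": ["0110"],
    "10": ["1000"], "110": ["1100", "1101"], "111": ["1110", "1111"],
}
dht = {label: {"leaf": False, "keys": []} for label in INTERNAL}
dht.update({label: {"leaf": True, "keys": keys} for label, keys in LEAVES.items()})


def dht_get(label):
    """Stand-in for one DHT get; returns None when no node has this label."""
    return dht.get(label)


def to_bits(k):
    return format(k, "0{}b".format(BITS))


def lookup(bits):
    """Binary search over prefix lengths for the leaf whose label prefixes bits."""
    lo, hi = 0, len(bits)
    while lo <= hi:
        mid = (lo + hi) // 2
        node = dht_get(bits[:mid])
        if node is None:
            hi = mid - 1      # no node this deep: the leaf is shallower
        elif node["leaf"]:
            return bits[:mid], node
        else:
            lo = mid + 1      # interior node: the leaf is deeper
    raise KeyError(bits)


def common_prefix(a, b):
    n = 0
    while n < len(a) and a[n] == b[n]:
        n += 1
    return a[:n]


def range_query(k1, k2):
    """Return the stored keys k with k1 <= k <= k2, as sorted integers."""
    lo_bits, hi_bits = to_bits(k1), to_bits(k2)
    start = common_prefix(lo_bits, hi_bits)
    if dht_get(start) is None:
        # P lies below a leaf, so that single leaf holds every candidate.
        start, _ = lookup(lo_bits)
    results, stack = [], [start]
    while stack:
        label = stack.pop()
        # Skip subtrees whose key interval cannot overlap the query.
        if label.ljust(BITS, "1") < lo_bits or label.ljust(BITS, "0") > hi_bits:
            continue
        node = dht_get(label)
        if node["leaf"]:
            results += [int(b, 2) for b in node["keys"] if lo_bits <= b <= hi_bits]
        else:
            stack += [label + "0", label + "1"]
    return sorted(results)


print(lookup("0101")[0])   # 010
print(range_query(4, 8))   # [4, 5, 6, 8]
```

Note that both operations touch the DHT only through get, which is why the range index needs no change to DHT routing.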

11
2-D geographic queries

  • Convert lat/lon into 1-D key
  • Use z-curve linearization
  • Interleave lat/lon bits to create z-curve key
  • Linearized query results may not be contiguous
  • Start at longest prefix subtree
  • Visit child nodes only if they can contribute to
    query result
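A short sketch of the linearization described above, under stated assumptions: coordinates are already quantized to small integers (3 bits each, matching an 8 x 8 grid), the latitude bit is placed before the longitude bit at each level (the slides do not say which comes first), and `z_key`, `prefix_box`, and `intersects` are hypothetical names.

```python
def z_key(lat, lon, bits=3):
    """Interleave the bits of lat and lon (each in [0, 2**bits)) into one key."""
    key = 0
    for i in reversed(range(bits)):
        key = (key << 1) | ((lat >> i) & 1)  # latitude bit first (assumption)
        key = (key << 1) | ((lon >> i) & 1)
    return key


def prefix_box(prefix, bits=3):
    """Return ((lat_min, lat_max), (lon_min, lon_max)) covered by a key prefix."""
    lo = prefix.ljust(2 * bits, "0")  # smallest key under this prefix
    hi = prefix.ljust(2 * bits, "1")  # largest key under this prefix
    # Even positions hold latitude bits, odd positions hold longitude bits.
    lat_lo, lon_lo = int(lo[0::2], 2), int(lo[1::2], 2)
    lat_hi, lon_hi = int(hi[0::2], 2), int(hi[1::2], 2)
    return (lat_lo, lat_hi), (lon_lo, lon_hi)


def intersects(prefix, lat_range, lon_range, bits=3):
    """True if the rectangle under prefix overlaps the query rectangle."""
    (la0, la1), (lo0, lo1) = prefix_box(prefix, bits)
    return (la0 <= lat_range[1] and lat_range[0] <= la1
            and lo0 <= lon_range[1] and lon_range[0] <= lo1)


# A 2 x 2 query rectangle maps to keys that are not contiguous on the curve.
keys = sorted(z_key(lat, lon) for lat in (1, 2) for lon in (1, 2))
print(keys)                                # [3, 6, 9, 12]
print(intersects("0011", (1, 2), (1, 2)))  # True: this subtree may hold results
print(intersects("11", (1, 2), (1, 2)))    # False: this subtree can be skipped
```

The gaps between 3, 6, 9, and 12 are why a query cannot simply scan one key interval. A traversal can instead test each child prefix with a check like `intersects` and descend only into subtrees that may contribute results.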

[Figure: 8 x 8 grid; axes: latitude (0 to 7) and longitude (0 to 7); label: P10]

12
PHT Visualization
13
Ease of implementation and deployment
  • 2,100 lines of code to hook Place Lab into
    underlying DHT service
  • Compare with 14,000 lines for the DHT
  • Runs entirely on top of deployed OpenDHT service
  • DHT handles fault tolerance and robustness, and
    masks failures of Place Lab servers

14
Flexibility of DHT APIs
  • Range queries use only the get operation
  • Updates use combination of put, get, remove
  • But
  • Concurrent updates can cause inefficiencies
  • No support for concurrency in existing DHT APIs
  • A test-and-set extension can be beneficial to
    PHTs and a range of other applications
  • put_conditional: perform the put only if the value
    has not changed since previous get
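One way to picture the proposed test-and-set extension is with a version number per key. This is only a sketch: the slides do not give a signature for put_conditional, so the `ToyStore` class, the version counter, and the boolean return value are all assumptions made for illustration.

```python
import threading


class ToyStore:
    """In-memory stand-in for a DHT with a hypothetical put_conditional."""

    def __init__(self):
        self._data = {}  # key -> (value, version)
        self._lock = threading.Lock()

    def get(self, key):
        """Return (value, version); version 0 means the key was never written."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def put_conditional(self, key, value, seen_version):
        """Write only if the key is still at seen_version; report success."""
        with self._lock:
            _, current = self._data.get(key, (None, 0))
            if current != seen_version:
                return False  # someone else wrote since our get
            self._data[key] = (value, current + 1)
            return True


store = ToyStore()
_, v0 = store.get("R01")
store.put_conditional("R01", ["0100", "0110"], v0)

# Two writers read the same version of a leaf node...
leaf_a, ver_a = store.get("R01")
leaf_b, ver_b = store.get("R01")
# ...the first update succeeds; the second is rejected and must re-read and retry.
first = store.put_conditional("R01", leaf_a + ["0101"], ver_a)
second = store.put_conditional("R01", leaf_b + ["0111"], ver_b)
print(first, second)  # True False
```

With plain put, the second writer would overwrite the first writer's change without noticing. The conditional form turns that lost update into a visible failure that the writer can handle by retrying.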

15
PHT insert performance
  • Median insert latency is 1.45 sec
  • Without caching: 3.25 sec; with caching: 0.76 sec

16
PHT query performance
Data size   Latency (sec)
5k          2.13
10k         2.76
50k         3.18
100k        3.75
  • Queries on average take 2 to 4 seconds
  • Varies with block size
  • Smaller (or very large) block size implies longer
    query time

17
Conclusion
  • Concrete example of building complex applications
    on top of vanilla DHT service
  • DHT provides ease of implementation and
    deployment
  • Layering allows inheriting of robustness,
    availability and scalable routing from DHT
  • Sacrifices performance in return