Title: BATON Papers Overview
1. BATON Papers Overview
- Jagadish, Ooi, and Vu, BATON: A Balanced Tree Structure for P2P Networks, VLDB 2005.
- Jagadish et al., Speeding up Search in P2P Networks with a Multi-way Tree Structure, SIGMOD 2006.
- Presented by Nick Taylor, October 4, 2006
2. Overlay Networks
- Add an additional routing protocol on top of network routing (e.g., TCP/IP)
- Need to be resilient to node failures and intermittent connectivity
- Redundancy in data and metadata
- Most popular is the Distributed Hash Table (DHT)
  - Pastry
  - Chord
3. Chord Overview
- Randomly assign each peer an ID between 0 and 2^m
- Hash each item's key to find an ID for the item
- The next peer after that ID (mod 2^m) is responsible for storing the item
- Each peer has a routing table to subsequent peers
- The table is used for incremental routing
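A minimal sketch of the placement and routing-table rules above. The SHA-1 hash, the peer IDs, and the helper names (`key_to_id`, `successor`, `finger_table`) are assumptions made for illustration, not details from the slides. Each table entry points at the peer responsible for an ID a power of two further around the ring.

```python
import hashlib
from bisect import bisect_left
from typing import List, Sequence

M = 6  # number of identifier bits; IDs live in [0, 2**M)


def key_to_id(key: str, m: int = M) -> int:
    """Hash a key to an m-bit ID (SHA-1 is an assumption of this sketch)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(peer_ids: Sequence[int], ident: int) -> int:
    """First peer whose ID is >= ident, wrapping around the ring.

    peer_ids must be sorted in ascending order.
    """
    i = bisect_left(peer_ids, ident)
    return peer_ids[i % len(peer_ids)]


def finger_table(peer_ids: Sequence[int], n: int, m: int = M) -> List[int]:
    """Routing table of peer n: the successor of n + 2**i for i = 0 .. m-1."""
    return [successor(peer_ids, (n + 2 ** i) % (2 ** m)) for i in range(m)]


# Hypothetical ring of ten peers with 6-bit IDs.
peers = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

print(successor(peers, 54))    # peer responsible for ID 54
print(successor(peers, 60))    # wraps around to the lowest peer ID
print(finger_table(peers, 8))  # routing table of peer 8
```

A lookup would repeatedly jump to the table entry that gets closest to the target ID without passing it, which is what makes the routing incremental.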
4. Chord Routing Example (m = 6)
Figures poached from Stoica et al., Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, SIGCOMM 2001.
5. DHTs: The Good, the Bad and the Ugly
- Good
  - Efficient routing algorithms
  - Easy to understand
  - Simple replication methods
- Bad
  - Load balancing depends on a good choice of hash function
- Ugly
  - Can only do exact ID lookups, since hashing scatters items that are clustered in key space
  - Would like to do range and prefix searches
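To illustrate why range and prefix searches are awkward on a DHT, the sketch below hashes a contiguous run of keys and checks whether their IDs keep the same order. The SHA-1 hash, the 16-bit ID space, and the key names are assumptions chosen for the example.

```python
import hashlib


def dht_id(key: str, bits: int = 16) -> int:
    """Map a key to an ID in [0, 2**bits) using SHA-1 (an assumed hash)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


# Twenty keys that are adjacent in key space (sorted, shared prefix).
keys = [f"item{n:03d}" for n in range(20)]
ids = [dht_id(k) for k in keys]

# If hashing preserved key order, the IDs would already be sorted.
order_preserved = ids == sorted(ids)
print(order_preserved)
```

With a well-mixing hash, `order_preserved` is expected to be false, so a range of keys maps to IDs spread over many unrelated peers.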
6. BAlanced Tree Overlay Network
Note that the routing table for a node m contains details about nodes m - 2^2, m - 2^1, m - 2^0, m + 2^0, and m + 2^1, sort of like Chord.
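The power-of-two offsets in the note above can be sketched as follows. This assumes that nodes on one level of a binary tree are numbered 1 to 2^level from left to right (root at level 0) and that links stay within that level. The function name and the example position (level 3, number 6) are illustrative choices, picked so that the result has three left entries and two right entries like the list in the note.

```python
from typing import List, Tuple


def routing_positions(level: int, number: int) -> Tuple[List[int], List[int]]:
    """Same-level positions at power-of-two distances from `number`.

    Assumes positions on a level run from 1 to 2**level (root is level 0).
    Returns (left_positions, right_positions), nearest first.
    """
    width = 2 ** level
    left, right = [], []
    step = 1
    while step < width:
        if number - step >= 1:
            left.append(number - step)
        if number + step <= width:
            right.append(number + step)
        step *= 2
    return left, right


left, right = routing_positions(level=3, number=6)
print(left)   # positions 6 - 1, 6 - 2, 6 - 4
print(right)  # positions 6 + 1, 6 + 2 (6 + 4 falls off the level)
```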
7. Lookup Algorithm
Lookup for 74 starting at node h
8. Lookup Algorithm (2)
- Rarely moves up the tree
  - If at a leaf node
  - If the value is higher up the tree
- At most O(log N) messages are sent to do a lookup
- A range query finds the start of the range using the lookup procedure, then follows links between nodes
- Insertion and deletion use the lookup procedure
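A rough sketch of the range-query idea: locate the node that owns the start of the range, then walk the links to adjacent nodes until the end of the range is passed. The `Node` class, the ranges, and the keys are hypothetical. The `lookup` here is a plain binary search standing in for the overlay's routing-table lookup, which is not modelled.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    low: int    # inclusive lower bound of the node's key range
    high: int   # exclusive upper bound of the node's key range
    keys: List[int] = field(default_factory=list)
    right_adjacent: Optional["Node"] = None  # node owning the next range


def build_chain(specs) -> List[Node]:
    """Create nodes in range order and link each one to its right neighbor."""
    nodes = [Node(name, low, high, sorted(keys)) for name, low, high, keys in specs]
    for a, b in zip(nodes, nodes[1:]):
        a.right_adjacent = b
    return nodes


def lookup(nodes: List[Node], key: int) -> Optional[Node]:
    """Stand-in for the overlay lookup: binary search over range lower bounds."""
    i = bisect_right([n.low for n in nodes], key) - 1
    if i < 0:
        return None
    node = nodes[i]
    return node if node.low <= key < node.high else None


def range_query(nodes: List[Node], start: int, end: int) -> Tuple[List[int], List[str]]:
    """Return (keys with start <= key <= end, names of the nodes visited)."""
    node = lookup(nodes, start)
    found, visited = [], []
    while node is not None and node.low <= end:
        visited.append(node.name)
        found.extend(k for k in node.keys if start <= k <= end)
        node = node.right_adjacent
    return found, visited


nodes = build_chain([
    ("a", 0, 25, [3, 17]),
    ("b", 25, 50, [25, 31, 49]),
    ("c", 50, 75, [52, 74]),
    ("d", 75, 100, [80, 99]),
])
print(range_query(nodes, 30, 74))
```

Only the first step pays the lookup cost; every later step is a single hop along an adjacency link.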
9. Node addition and removal rules
- Need to preserve the balanced structure of the tree
- Addition: forward the request to a node with a full routing table but at most one child
- Removal: find a leaf node that can be moved without causing imbalance
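The addition rule can be read as a local test that each peer applies to a join request. The sketch below is one possible reading, with a hypothetical `Peer` record in which a missing routing-table entry is stored as `None`. How a peer picks the next hop when it forwards the request is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Peer:
    name: str
    left_table: List[Optional[str]] = field(default_factory=list)
    right_table: List[Optional[str]] = field(default_factory=list)
    children: List[str] = field(default_factory=list)


def can_accept_child(peer: Peer) -> bool:
    """True if every routing-table slot is filled and the peer has at most one child."""
    tables_full = all(entry is not None
                      for entry in peer.left_table + peer.right_table)
    return tables_full and len(peer.children) <= 1


# Hypothetical peers: one qualifies, one has a gap in its table, one is full of children.
ok = Peer("m", ["l", "k", "i"], ["n", "o"], ["r"])
gap = Peer("x", ["w", None], ["y"], [])
full = Peer("f", ["e"], ["g"], ["l", "m"])

print(can_accept_child(ok), can_accept_child(gap), can_accept_child(full))
```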
10. Load Balancing
- May need to redistribute ranges between nodes
- Simple case just affects immediate neighbors
- General algorithm
  - Identify overloaded node g
  - Identify lightly loaded node f
  - Transfer f's data to c
  - Add f as a child of g
  - If the tree is unbalanced, perform a rotation
    - Need to update routing tables
    - Still cheaper than moving data around
[Figure: two tree diagrams over nodes a through g]
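For the simple case, where only immediate neighbors are affected, a sketch might look like the following. It assumes two adjacent nodes whose ranges meet at a boundary, so that every key on the left node is smaller than every key on the right node. The function name and the even split are illustrative choices, not the papers' exact policy.

```python
from typing import List, Optional, Tuple


def rebalance_adjacent(left_keys: List[int],
                       right_keys: List[int]) -> Tuple[List[int], List[int], Optional[int]]:
    """Even out the load of two adjacent nodes.

    Assumes all keys in left_keys are smaller than all keys in right_keys.
    Returns (new_left_keys, new_right_keys, new_boundary), where new_boundary
    is the smallest key now held by the right node (None if it holds nothing).
    """
    merged = sorted(left_keys) + sorted(right_keys)
    split = len(merged) // 2
    left, right = merged[:split], merged[split:]
    boundary = right[0] if right else None
    return left, right, boundary


# The left node is overloaded: six keys versus two.
print(rebalance_adjacent([1, 2, 3, 4, 5, 6], [10, 12]))
```

Only the keys that cross the boundary move, and only the two neighbors need to learn the new boundary.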
11. Experimental Results
- Maintenance overhead is slightly lower than Chord's
- Performance is slightly worse than Chord's
- Root node does not get overloaded
- Leaves get slightly more traffic than the root
12. BATON*
- Generalizes techniques from binary trees to m-ary trees
- Improves query performance
- Increases maintenance cost
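One way to see where the query improvement comes from is that a larger fanout makes the tree shallower, so a search crosses fewer levels. The sketch below computes the height of the shallowest tree that can hold a given number of nodes. The function name and the node count of 1000 are illustrative, and the extra routing state that drives the maintenance cost is not modelled.

```python
def tree_height(num_nodes: int, fanout: int) -> int:
    """Height (root at height 0) of the shallowest fanout-ary tree with num_nodes nodes."""
    if num_nodes < 1 or fanout < 2:
        raise ValueError("need num_nodes >= 1 and fanout >= 2")
    capacity = 1    # nodes that fit in the levels counted so far
    level_size = 1  # nodes that fit in the deepest level counted so far
    height = 0
    while capacity < num_nodes:
        level_size *= fanout
        capacity += level_size
        height += 1
    return height


for m in (2, 4, 10):
    print(m, tree_height(1000, m))
```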
13. Fault tolerance and load balancing
- m-ary trees are harder to partition
- m-ary trees have more leaf nodes
- Easier to do local load balancing
14. Conclusions
- DHTs and BATON each have their strengths
  - DHTs: simplicity, slightly better performance, easy replication
  - BATON: lower maintenance overhead, range queries
- m-ary trees are often better than binary trees
  - Depends on frequencies of queries and updates