SKIP GRAPHS - PowerPoint PPT Presentation

About This Presentation

Title:

SKIP GRAPHS

Description:

Store resources identified by keys. Peers subject to crash failures. ... Incorporating geography. Incorporating locality [temporal, spatial] 4. Distributed Hash Tables ... – PowerPoint PPT presentation

Slides: 37

Provided by: gauri9

Learn more at: http://homepage.divms.uiowa.edu

Transcript and Presenter's Notes

Title: SKIP GRAPHS

1
SKIP GRAPHS
James Aspnes and Gauri Shah, SODA 2003
2
P2P system

Bunch of peers.
Store resources identified by keys.
Peers subject to crash failures.
Goal: locate resources efficiently.

3
Properties of ideal network

Data availability
Decentralization
Fault-tolerance
Scalability
Load balancing
Maintaining the network
Dynamic node addition/deletion
Self-stabilization
Efficient searching
Incorporating geography
Incorporating locality [temporal, spatial]

4
Distributed Hash Tables
[Figure: keys and nodes are hashed (HASH) into a virtual overlay network of nodes v1, v2, v3, v4. A virtual route over virtual links corresponds to an actual route over physical links in the physical network.]
5
Advantages

Load balancing.
Decentralization.
O(log n) space and search time.
O(log^2 n) insert and delete time: search for log n neighbors.
Tolerance of random faults.

Disadvantages

No locality properties.
No tolerance to adversarial faults.
No self-stabilization.
No optimization with respect to geography.

SKIP GRAPHS
6
Skip List [Pugh '90]
Data structure based on a linked list.
[Figure: level 0 linked list HEAD, A, G, J, M, R, W, TAIL.]
Each node is linked at the next higher level with probability 1/2.
7
Searching in a skip list
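Search for key R.
[Figure: skip list with levels 0, 1, and 2 between HEAD and TAIL. Level 2: J. Level 1: A, J, M. Level 0: A, G, J, M, R, W. Comparisons along the search path are marked as success or failure.]
Time for search: O(log n) on average.
On average, constant number of pointers per node.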
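The search described above can be sketched in Python. This is a minimal illustration, not code from the presentation: the names `SkipNode`, `random_height`, `build`, and `search` are hypothetical, and the list is built once from a sorted set of keys rather than by incremental insertion.

```python
import random

class SkipNode:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height   # forward[i] = next node at level i

def random_height(rng):
    """Each node is promoted to the next level with probability 1/2."""
    height = 1
    while rng.random() < 0.5:
        height += 1
    return height

def build(keys, rng):
    """Build a skip list over the given keys and return its HEAD node."""
    keys = sorted(keys)
    heights = [random_height(rng) for _ in keys]
    head = SkipNode(None, max(heights, default=1))
    last = [head] * len(head.forward)    # rightmost node so far at each level
    for key, height in zip(keys, heights):
        node = SkipNode(key, height)
        for level in range(height):
            last[level].forward[level] = node
            last[level] = node
    return head

def search(head, target):
    """Start at the top level; move right while keys stay <= target, else drop."""
    node = head
    for level in reversed(range(len(head.forward))):
        while node.forward[level] is not None and node.forward[level].key <= target:
            node = node.forward[level]
    return node is not head and node.key == target

head = build("AGJMRW", random.Random(1))
print(search(head, "R"), search(head, "K"))
```

The search ends at the largest key that does not exceed the target, so the same walk also answers predecessor queries.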
8
Skip lists for P2P?
Advantages

O(log n) expected search time.
Retains locality.
Dynamic node additions/deletions.

Disadvantages

Heavily loaded top-level nodes.
Easily susceptible to random failures.
Lacks redundancy.

9
A Skip Graph
[Figure: a skip graph on keys A, G, J, M, R, W, shown at levels 0, 1, and 2. Each node is labeled with a 3-bit membership vector.]
Membership vectors
At level i, each node links to the nodes whose membership vectors share a prefix of length i with its own.
Think of a tree of skip lists that share lower layers.
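A small Python sketch of the linking rule: nodes that agree on the first i bits of their membership vectors share a sorted list at level i. The helper `level_lists` is hypothetical, and the membership vectors below are illustrative values assumed for the example, not read off the slide.

```python
from collections import defaultdict

def level_lists(vectors, level):
    """Group keys into the sorted lists that exist at the given level."""
    lists = defaultdict(list)
    for key in sorted(vectors):
        lists[vectors[key][:level]].append(key)
    return dict(lists)

# Assumed membership vectors for the six keys.
VECTORS = {"A": "000", "G": "100", "J": "001", "M": "011", "R": "110", "W": "101"}

for level in range(3):
    print(level, level_lists(VECTORS, level))
```

Level 0 is a single list holding every key; each higher level splits every list in two according to the next bit.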
10
Properties of skip graphs

Searching.
Node insertions.
Independence from system size.
Locality and range queries.

11
Searching: avg. O(log n)
Restricting to the lists containing the starting
element of the search, we get a skip list.
[Figure: the skip graph on keys A, G, J, M, R, W at levels 0, 1, and 2, with the lists that contain the starting node highlighted.]
Same performance as DHTs.
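A Python sketch of this search, under stated assumptions: the membership vectors are illustrative, `build_neighbors` and `search` are hypothetical helpers, and the neighbor table is computed centrally here, whereas in a real skip graph each peer would hold only its own links.

```python
def build_neighbors(vectors):
    """Return {key: [(left, right) at level 0, 1, ...]} for every node."""
    keys = sorted(vectors)
    depth = len(next(iter(vectors.values())))
    table = {key: [] for key in keys}
    for level in range(depth + 1):
        for key in keys:
            peers = [p for p in keys if vectors[p][:level] == vectors[key][:level]]
            i = peers.index(key)
            left = peers[i - 1] if i > 0 else None
            right = peers[i + 1] if i + 1 < len(peers) else None
            table[key].append((left, right))
    return table

def search(table, start, target):
    """Walk from the top level of `start`, never overshooting `target`."""
    node, hops = start, [start]
    level = len(table[node]) - 1
    while level >= 0 and node != target:
        left, right = table[node][level]
        if target > node and right is not None and right <= target:
            node = right
            hops.append(node)
        elif target < node and left is not None and left >= target:
            node = left
            hops.append(node)
        else:
            level -= 1               # no useful neighbor here, drop a level
    return node, hops

# Assumed membership vectors, for illustration only.
VECTORS = {"A": "000", "G": "100", "J": "001", "M": "011", "R": "110", "W": "101"}
table = build_neighbors(VECTORS)
print(search(table, "A", "R"))
```

Only the lists that contain the current node are ever used, which is why the search behaves like a search in an ordinary skip list.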
12
Node Insertion - 1
[Figure: a new node contacts a buddy node in a skip graph on keys A, G, M, R, W, shown at levels 0, 1, and 2 with membership vectors.]
Starting at the buddy node, find the nearest key at level 0.
This is basically a range query looking for the key closest to the new key.
Takes O(log n) on average.
13
Node Insertion - 2
At each level i, find the nearest node whose membership vector matches the new node's on a prefix of length i+1.
[Figure: the new node is linked into the skip graph on keys A, G, M, R, W one level at a time, from level 0 up to level 2.]
Total time for insertion: O(log n). DHTs take O(log^2 n).
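The two insertion steps can be sketched in Python. This is a simplified illustration under assumptions: `Node`, `link`, `nearest`, and `insert` are hypothetical names, membership vectors are fixed-length bit strings chosen for the example, and step 1 walks level 0 linearly where a real skip graph would use its O(log n) search.

```python
class Node:
    def __init__(self, key, vector):
        self.key = key
        self.vector = vector     # membership vector as a bit string
        self.left = {}           # level -> left neighbor (or None)
        self.right = {}          # level -> right neighbor (or None)

def link(a, b, level):
    """Make a and b adjacent at the given level; either may be None."""
    if a is not None:
        a.right[level] = b
    if b is not None:
        b.left[level] = a

def nearest(node, level, side, prefix):
    """Walk along `level` from node until a vector starts with `prefix`."""
    cur = getattr(node, side).get(level)
    while cur is not None and not cur.vector.startswith(prefix):
        cur = getattr(cur, side).get(level)
    return cur

def insert(buddy, new):
    pred = succ = None
    if buddy is not None:
        # Step 1: find the keys on either side of new.key at level 0.
        node = buddy
        while node.key > new.key and node.left.get(0) is not None:
            node = node.left[0]
        while node.right.get(0) is not None and node.right[0].key < new.key:
            node = node.right[0]
        if node.key < new.key:
            pred, succ = node, node.right.get(0)
        else:
            pred, succ = None, node
    # Step 2: link at level i, then look along level i for level i+1 neighbors.
    for level in range(len(new.vector) + 1):
        link(pred, new, level)
        link(new, succ, level)
        if pred is None and succ is None:
            break                # alone in this list, nothing higher to join
        prefix = new.vector[:level + 1]
        pred = nearest(new, level, "left", prefix)
        succ = nearest(new, level, "right", prefix)

# Assumed membership vectors, for illustration only.
VECTORS = {"A": "000", "G": "100", "J": "001", "M": "011", "R": "110", "W": "101"}
nodes = {key: Node(key, vec) for key, vec in VECTORS.items()}
first, *rest = ["M", "A", "W", "G", "R", "J"]
insert(None, nodes[first])
for key in rest:
    insert(nodes[first], nodes[key])
print(nodes["A"].right[1].key, nodes["G"].right[2].key)
```

Each level is linked before the next one is searched, so the walk at level i only passes nodes that already share i bits with the new node.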
14
Independent of system size
No need to know size of keyspace or number of
nodes.
[Figure: nodes E and Z with one-bit membership vectors, shown at levels 0 and 1.]
Old nodes extend their membership vectors as required when new nodes arrive.
DHTs require knowledge of the keyspace size initially.
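One way to picture vectors that grow on demand is to draw their bits lazily. The Python sketch below is an assumption about how this could be modeled, not the authors' implementation; `MembershipVector` and `levels_needed` are hypothetical names.

```python
import random

class MembershipVector:
    """Random bit string whose bits are drawn only when first needed."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._bits = []

    def prefix(self, length):
        # Extend the vector only as far as the caller asks.
        while len(self._bits) < length:
            self._bits.append(self._rng.getrandbits(1))
        return tuple(self._bits[:length])

def levels_needed(a, b, limit=64):
    """Smallest level at which nodes a and b land in different lists."""
    for level in range(1, limit + 1):
        if a.prefix(level) != b.prefix(level):
            return level
    return None

e = MembershipVector(random.Random(1))
z = MembershipVector(random.Random(2))
print(levels_needed(e, z))
```

No node needs to know the number of peers or the size of the keyspace in advance; bits already drawn never change when more are added.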
15
Locality and range queries

Find key < F, > F.
Find largest key < x.
Find least key > x.
Find all keys in interval [D..O].
Initial node insertion at level 0.

[Figure: level 0 list with keys A, D, F, I, L, O, S.]
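Because level 0 keeps the keys in sorted order, these queries reduce to ordered lookups. The Python sketch below models level 0 as a sorted list and uses the standard `bisect` module; the function names are hypothetical, and a real skip graph would reach the position with its O(log n) search and then walk the level 0 links.

```python
from bisect import bisect_left, bisect_right

LEVEL0 = ["A", "D", "F", "I", "L", "O", "S"]   # keys from the slide's level 0 list

def largest_below(keys, x):
    """Largest key strictly less than x, or None."""
    i = bisect_left(keys, x)
    return keys[i - 1] if i > 0 else None

def least_above(keys, x):
    """Least key strictly greater than x, or None."""
    i = bisect_right(keys, x)
    return keys[i] if i < len(keys) else None

def keys_in_interval(keys, lo, hi):
    """All keys k with lo <= k <= hi."""
    return keys[bisect_left(keys, lo):bisect_right(keys, hi)]

print(largest_below(LEVEL0, "F"), least_above(LEVEL0, "F"))
print(keys_in_interval(LEVEL0, "D", "O"))
```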
16
Applications of locality
Version Control
e.g. find the latest news from yesterday: find the largest key < news10/29.
[Figure: level 0 list with keys news10/25, news10/26, news10/27, news10/28, news10/29.]
Data Replication
e.g. find any copy of some Britney Spears song.
[Figure: level 0 list with keys britney01, britney02, britney03, britney04, britney05.]
DHTs cannot do this easily, as hashing destroys locality.
17
So far...
Decentralization.
Locality properties.
O(log n) space per node.
O(log n) search, insert, and delete time.
Independent of system size.
18
Load balancing
Interested in average load on a node u, i.e. the number of searches from source s to destination t that use node u.
Theorem: Let dist(u, t) = d. Then the probability that a search from s to t passes through u is < 2/(d+1),
where V = {nodes v : u <= v <= t} and |V| = d+1.
19
Skip list restriction
[Figure: the lists containing s at levels 2, 1, and 0.]
Node u is on the search path from s to t only if
it is in the skip list formed from the lists of s
at each level.
20
Tallest nodes
[Figure: two searches from s to t, one in which u is not on the path and one in which u is on the path.]
Node u is on the search path from s to t only if it is in T, the set of k tallest nodes in [u..t].
Heights are independent of position, so distances are symmetric.
21
Load on node u
Start with n nodes. Each node goes to the next set with prob. 1/2. We want the expected size of T = the last non-empty set.
We show that E[|T|] < 2.
Asymptotically, E[|T|] = 1/(ln 2) ± 2x10^-5 ≈ 1.4427 [trie analysis].
Average load on a node is inversely proportional to the distance from the destination.
We also show that the distribution of average load declines exponentially beyond this point.
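The halving process behind T can be checked numerically. The Python sketch below is a Monte Carlo estimate, not the trie analysis cited on the slide; the function names, the trial count, and the seed are arbitrary choices made for the example.

```python
import random

def last_nonempty_size(n, rng):
    """Each member survives to the next set with prob. 1/2.
    Return the size of the last set before everything dies out."""
    size = n
    while True:
        survivors = sum(1 for _ in range(size) if rng.random() < 0.5)
        if survivors == 0:
            return size
        size = survivors

def estimate_expected_size(n, trials, seed=0):
    """Average the size of the last non-empty set over many runs."""
    rng = random.Random(seed)
    return sum(last_nonempty_size(n, rng) for _ in range(trials)) / trials

print(estimate_expected_size(64, 5000))
```

The estimate should come out below 2 and close to 1/(ln 2), in line with the bound stated above.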
22
Experimental result
[Figure: plot of load on node versus node location.]
23
Fault tolerance
How do node failures affect skip graph performance?
Random failures: randomly chosen nodes fail. Experimental results.
Adversarial failures: adversary carefully chooses nodes that fail. Bound on expansion ratio.

24
Random faults
131072 nodes
25
Searches with random failures
131072 nodes, 10000 messages
26
Adversarial faults
dA = nodes adjacent to A but not in A.
Expansion ratio = min |dA|/|A|, 1 <= |A| <= n/2.
[Figure: a set A and its neighbor set dA.]
Theorem: A skip graph with n nodes has expansion ratio Ω(1/log n).
f failures can isolate only O(f log n) nodes.
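For a very small graph the expansion ratio can be computed by brute force straight from the definition. The Python sketch below is illustrative only: it enumerates every subset, so it is exponential in n, and `skip_graph_edges`, `expansion_ratio`, and the membership vectors are assumptions made for the example.

```python
from itertools import combinations

def skip_graph_edges(vectors):
    """Adjacency sets: nodes adjacent in any level list are neighbors."""
    keys = sorted(vectors)
    depth = max(len(v) for v in vectors.values())
    adj = {key: set() for key in keys}
    for level in range(depth + 1):
        groups = {}
        for key in keys:
            groups.setdefault(vectors[key][:level], []).append(key)
        for members in groups.values():
            for a, b in zip(members, members[1:]):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def expansion_ratio(adj):
    """min |dA|/|A| over all node sets A with 1 <= |A| <= n/2."""
    nodes = list(adj)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            a = set(subset)
            boundary = set().union(*(adj[v] for v in a)) - a
            best = min(best, len(boundary) / len(a))
    return best

# Assumed membership vectors, for illustration only.
VECTORS = {"A": "000", "G": "100", "J": "001", "M": "011", "R": "110", "W": "101"}
print(expansion_ratio(skip_graph_edges(VECTORS)))
```

A larger ratio means an adversary must remove more nodes to cut off a set of a given size.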
27
Proof intuition
Consider neighbors of set A at level 0.
1. Clumpy sets
[Figure: a clumpy set A and its neighbors dA at level 0.]
Low probability of clumpy sets.
2. Non-clumpy sets
[Figure: a non-clumpy set A at level 0.]
Non-clumpy sets have many neighbors at level 0. Gives high expansion ratio.
28
Expansion ratio
All sets have low probability of few neighbors at level h, and there are not too many clumpy sets.
Low probability that any set A has few neighbors at level 0 or h.
This gives expansion ratio Ω(1/log n).
Same analysis applicable to DHTs?
29
Need for repair mechanism
[Figure: the skip graph on keys A, G, J, M, R, W at levels 0, 1, and 2 after node failures.]
Node failures can leave the skip graph in an inconsistent state.
30
Ideal skip graph
Let xR_i (xL_i) be the right (left) neighbor of x at level i.
Invariant: if xL_i and xR_i exist, then
xL_i < x < xR_i.
xL_i R_i = xR_i L_i = x.
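The invariant translates into a simple consistency check. In the Python sketch below the neighbor pointers are stored in two dictionaries keyed by (node, level); that representation and the name `satisfies_invariant` are assumptions made for the example. Each side is checked whenever that neighbor exists.

```python
def satisfies_invariant(left, right):
    """left[(x, i)] and right[(x, i)] hold x's neighbors at level i, or None."""
    for (x, i), l in left.items():
        r = right.get((x, i))
        # The left neighbor must be smaller and must point back to x.
        if l is not None and not (l < x and right.get((l, i)) == x):
            return False
        # The right neighbor must be larger and must point back to x.
        if r is not None and not (x < r and left.get((r, i)) == x):
            return False
    return True

left = {("A", 0): None, ("B", 0): "A", ("C", 0): "B"}
right = {("A", 0): "B", ("B", 0): "C", ("C", 0): None}
print(satisfies_invariant(left, right))

right_broken = dict(right)
right_broken[("A", 0)] = "C"     # A now skips over B, but C still points to B
print(satisfies_invariant(left, right_broken))
```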
31
Basic repair
If a node detects a missing neighbor, it tries
to patch the link using other levels.
Also relink at other lower levels.
Successor constraints may be violated by node
arrivals or failures.
32
Constraint violation
Neighbor at level i not present at level (i-1).
[Figure: node x at level i and level i-1, with neighboring membership vector fragments ..00.. and ..01..]
33
Self-stabilization
[Figure: at level i, nodes A, C, D, F, J and nodes B, E, G, H, I exchange zipperOp messages: zOp(B), zOp(E), zOp(I), zOp(D), zOp(A), zOp(F).]
Eventually, we want each connected component of the skip graph to reorganize itself into an ideal skip graph.
34
Conclusions
Similarities with DHTs

Decentralization.
O(log n) space at each node.
O(log n) search time.
Load balancing properties.
Tolerant of random faults.

35
Differences
Property                         DHTs        Skip Graphs
Insert/Delete time               O(log^2 n)  O(log n)
Locality                         No          Yes
Repair mechanism                 ?           Partial
Tolerance of adversarial faults  ?           Yes
Keyspace size                    Reqd.       Not reqd.
36
Open Problems