Title: Building Low-maintenance Expressways for P2P Systems
1Building Low-maintenance Expressways for P2P
Systems
- Zhichen Xu and Zheng Zhang
- Presented By
- Prasetha Panampully
2Overview
- P2P systems can be improved in 3 areas
- Maintenance cost
- Ability to make discriminative use of nodes
- Ability to adapt to network conditions
- and application needs
- Here we try to boost the routing performance of
CAN using Expressways
3Agenda
- Brief overview of CAN
- Definition of an Expressway
- Building Expressways
- Routing with Expressways
- Load balance with Expressways
- Maintenance
- Analysis
- Tuning to best performance
- Tuning to network conditions
- Route around a congested node
- Tuning according to application needs
4P2P Systems
- Utilize distributed resources and perform
critical functions in a de-centralized manner - Advantages
- Cost sharing
- Autonomous
- Resource aggregation
- Improved scalability/reliability
- Increased autonomy
- Anonymity/privacy
- Dynamism
- Ad-hoc communication
5Content Addressable Network (CAN)
- Offers administration-free and fault-tolerant
storage utility. - CAN is a distributed infrastructure which
provides hash table like functionality. - It is scalable, fault-tolerant and completely
self-organizing. - CAN organizes the logical space as a
d-dimensional Cartesian coordinate space. - The co-ordinate space is partitioned among
existing nodes and each node owns its individual
zone.
6CAN Example
-
- Node n1(1,2) is first node that joins and it
covers the entire space. - Then n1 has the authority over all the files
present in this space - As we are considering only two dimension, it will
be only one copy of particular file in this
dimension
7
6
5
4
3
n1
2
1
0
2
3
4
5
6
7
0
1
7CAN Example
- Actions on Node Join
- When a new node joins, it joins a node that
is close to it in IP distance. - The existing node will split its allocated zone
into half. - The neighbors of the split zone must be
notified.
7
6
5
4
3
n1
n2
2
1
0
2
3
4
5
6
7
0
1
8CAN Example
- If one more node enters the space covered by 1
then the region of 1 is equally divided.
7
6
n3
5
4
3
n1
n2
2
1
0
9CAN Example
7
6
n5
- More example of
- the space division.
n4
n3
5
4
3
n1
n2
2
1
0
2
3
4
5
6
7
0
1
10CAN Example
7
- Here the f1, f2 etc are the ids of the
documents present in this particular network
space.
6
n5
n4
n3
5
f4
4
f1
3
n1
n2
2
f3
1
f2
0
2
3
4
5
6
7
0
1
11CAN Example
7
- Each item is stored by the node who owns its
mapping in the space
6
n5
n4
n3
5
f4
4
f1
3
n1
n2
2
f3
1
f2
0
2
3
4
5
6
7
0
1
12CAN Example
- Assume n1 queries for f4,it forwards the query to
n3 as it is the nearest neighbor. - n3 forwards it to n4 and n4 to n5.
- n5 sends back information to n1 that it has the
doc and the file exchange starts if n1 wishes to.
13Routing in CAN
- CAN node maintains a coordinate routing table
that holds the IP address and the virtual
coordinates of each of its neighboring zones. - A CAN message includes the destination
coordinates. Using its neighbor coordinate set, a
node routes a message towards its destination by
simply forwarding to the neighbor with
coordinates closest to the destination
coordinates.
14Routing in CAN
Consider a 2-d Cartesian space with 16 equal
zones. Each node has maximum of 4 neighbors. So,
maximum routing path length is 3 hops. For a
d-dimensional space which is partitioned into n
equal zones the average routing path length is
(d/4)(n(1/d)) hops.
15Routing in CAN
- No of neighbors for an individual node is 2d.
- Thus,for a d-dimensional space, we can grow the
number of nodes (zones) without increasing the
per node state. - The average routing path length grows in
proportion to (n(1/d)). - Thus the routing performance in CAN is
O(n(1/d)). - Expressways can improve the routing performance
to O(log N)
16Expressway
- Expressway is an auxiliary data structures to
deliver high routing performance. - Features
- Auxiliary mechanisms
- Self organized.
- Self maintained.
- Adaptive to changing network.
- High bandwidth.
-
17Overview of Expressways
- Expressways of CANs have routing tables of
increasing span. - The entire space is partitioned into zones of
different spans with smallest zones correspond to
the CAN zones and any other zones are called
Expressway zones. - Here CAN zones are at level 3, four of
neighboring CAN zones make one level 2 expressway
and four such level 2 zones make a level 1 zone.
18Overview of Expressways
- Each node owns a CAN zone and is also a resident
of the Expressway zone that encloses its CAN
zone. - Expressway zones and CAN zones are recorded in
each node in data structure called Total routing
table. Total routing table of node 1 consists of
the default routing table of CAN (plain arcs) and
expressway routing tables (thick and long arcs)
19Expressway Routing
- Total routing table RT lt R0, R1, R2, ..RLgt where
RL corresponds to the nodes default routing
table that CAN already builds. - Ri (i0 to L-1) is called Expressway routing
table that has larger span. - Smaller the i larger the span.
- For each neighboring expressway zone, the
expressway routing table keeps the address of one
or more nodes in that zone.
20Building an Expressway
- The algorithm for building expressway is
Evolving snapshot . - According to the algorithm, at regular intervals
of system growth, snapshots of current routing
table are taken. - Each node takes the snapshot independently by
observing its zone size, with which it may infer
to what stage the system has grown. - For node x, suppose the current zone size is
x.R.Z and the target size is x.R(L-1).Z/K. if the
xs current size shrinks to its target size then
it takes a new snapshot by incrementing L. where
K is the span of the expressway. - This frozen routing table is then pushed into xs
total routing table.
21Summary of Notations
22Expressway Building
- CAN can be thought as building a binary tree
since each new node splits a random existing
node. - When a node y splits with x, it inherits all
entries of xs total routing table other than xs - current routing table.
- Each newly added link of the tree leads to a new
node join - Total routing table can be found by walking down
the tree from the root towards the node picking
up the snapshot routing tables. - Tree rooted by x when it joined system is called
xs expressway tree.
23Procedure when a node y joins node x
-
- y.RT ltx.R0,x.R1,,x.RL-1,y.RLgt
- repeat the procedure for testing for a new
snap shot - Procedure for testing for new snapshot
-
- Both nodes test to see if its current zone
has shrunken to 1/K th of its last snapshot. - If (RL.Z lt RL-1.Z/K)
- RL1 RL
- RTltR0,R1,..,RL,RL1gt
- LL1
24Routing
Route with Expressway If(pt?RL.Z)
Return For(i0iltLi) If(pt?(Ri.Z))
Route using Ri Route with Ri
for(j0jltdj) if(ptltRi.Z.LjptgtRi.Z.Uj)
Route to x?Ri.Nj that is close to pt
Break
25Load Balance
- For achieving load balance . . .
- Node z that routes to x in expressway routing,
should have the option for replacing x with any
node inside the xs expressway tree. - Example
- Let Z.Ri be the routing table used to route to x
and X be the xs expressway zone, then we pick a
point pt in X and route to it. Whoever owns that
zone now replaces x in Z.Ri. Here the point pt is
selected randomly. - No of nodes that replace x is proportional to the
size of xs Expressway tree and therefore is
proportional to the no. of residents inside xs
Expressway tree. -
- Thus the algorithm will automatically balance
the load among Expressways.
26Default vs. Random Algorithm
This figure shows the simulation results
reporting number of messages each node has to
forward with default routing table and the
routing table constructed using random
selection. We see that with random selection,
more no of messages are forwarded
27Dynamic Maintenance
- Node Joins
- When a new node joins the network, it updates its
expressway neighbors for the expressway zones
recorded in its total routing table. Such
maintenance is proportional to the total levels
of expressways, which is O(logkN). - Node Departs
- When a node departs, one of its neighbors will
take over departed nodes responsibilities.
However the departed node can be the expressway
neighbor of multiple nodes and its expressway
functions need to be maintained by some other
nodes.
28Dynamic Maintenance
- Suppose node x departs from the network, and node
y attempts to route to node x, then the request
will timeout - Then node y picks up a point in the space of x
recorded in the failed routing table, and route
to it. This succeeds at node z whose zone
contains the point. z is now replacement of x
to repair ys snapshot. - On average the total no. of nodes that will
update their routing table entries - Total no. of expressway levels
- no. of neighbors in each level
- O(d.logkN).
29Storage
- The total routing table depth is m and as a
result the storage for routing table increases
m-folds. - mlogkN.
- Where, m no of levels of expressways
- Example
- If K4 and N220 (million nodes), then
- m 10 and takes only few hundred bytes
- to store the routing table.
- Therefore storage is not a concern.
30Analysis
Routing expressways involves at most m levels of
CAN like routing. Routing algorithm will iterate
through the table to get down to the level where
the expressway zone doesn't cover the
destination. It will the follow the CAN routing
to reach an expressway zone that covers the
destination. This process continues until the
destination is reached.
31Analysis
- Number of levels the routing will travel logkN
- The average routing hops in any level
(d/3)K(1/d) - No of hops in the worst case (logkN)(d/3)K(1/d)
- The bigger the K, the less levels of
expressways there are but, routing at each level
is more expensive, so we need to find the optimal
K.
32Analysis
- Let f(x)(d/3).(x1/d).logxN
- Taking the derivative and equating it to zero
- f(x) 0 gt(d/3).(1/d.x(1/d-1).lnN/lnx
- x(1/d-1).lnN/(lnx)2
- On simplification we get
- 1/d1/lnx
- gt x ed
- Thus the optimal K ed
- Optimal routing performancef(x)(d/3).((ed)1/d
).log(ed)N - d/3.e.lnN/d
- e/3 lnN
- Thus the routing performance of CAN
- using expressways is O(lnN). And it is
independent of choice of d.
33Expressway Performance Compared With Theoretical
Values
Comparison of theoretical performance using
expressway with results obtained using
simulation.
34Comparison of Expressway with CAN
Expressway with lowest dimension outperforms the
basic CAN
35Tuning Towards Best Performance
- The representative of each expressway zone can
be changed on the fly - Tuning towards network conditions
- Locating closest nodes using coordinate maps
- Build maps of network coordinates, and
utilizing the archival capacity of P2P system
itself to store and maintain the maps in various
zones in CAN. Any node can find a resident close
to it in a given expressway zone Z by consulting
the appropriate map
36Tuning to Network Conditions
- Devise a hash function that maps positions in
N/W co-ordinate space to points in a CAN zone
37Improving Co-ordinate Maps
38Ratio of Physical latencies of CAN using
co-ordinate maps to those of ideal CAN
Improvements are close to 50 and stay 2-3 range
of the ideal
39Average data latency per hop
With expressway physical distance of each logical
hop increases whereas for ideal case the
reverse is true.
40Route around a congested node
- Two Steps for handling Congestion
- Detect and Report Congestion
- Explicit congestion notification for congestion
avoidance - Active probing for congestion discovery
- Evaluate alternate routes
- Additional Method
- Integrate Congestion control as part
of the process - of locating most appropriate
expressway neighbor.
41Tuning According to Application Needs
- Reducing the per hop distance may not necessarily
reduce the total delay. - So, the total routing table of node keeps the
addresses of multiple nodes, instead of keeping
the address of the node that is physically
closest to it. - And while routing to a destination, we attempt to
balance the per hop distance and the total number
of logical hops to achieve overall small latency. - Routing performance can be further improved by
allocating some storage at each node to cache
routes based on runtime behavior.
42Heuristics
- When choosing a candidate y in node xs total
routing table to route from x to destination d
- Minimize the terms
- physical_distance(x,y) ratio_to_ideal
ideal_distance(y,d) - Where,
- physical_distance(x,y) physical distance
between x - and
expressway neighbor y - ideal_distance(y,d) estimates the ideal
distance between (y,d) - ratio_to_ideal ratio of avg.
physical dist. using map to - that of
ideal case
43Percentage improvement over IP optimization
This optimization reduces physical distance up to
19
44Contributions
- Using expressways the performance of routing in
CAN is improved from O(N1/d) to O(ln N). - Evaluation of the techniques show that
-
- Tuning the expressway to network conditions
improve the average physical routing distance by
about 50 - Tuning the expressway to application behavior
further reduce physical latency up to 19. -