Title: A Scalable Content-Addressable Network
1A Scalable Content-Addressable Network
- Sylvia Ratnasamy, Paul Francis, Mark Handley,
Richard Karp, Scott Shenker
Pirammanayagam Manickavasagam
2Overview
- Introduction
- Design
- Design Improvements
- Design Review
- Related Work
- Discussion
3Introduction
- Hash Table Functionality
- Maps a key to a value.
- Content-Addressable Network (CAN)
- A distributed infrastructure that provides
hash-table-like functionality at Internet-like
scale.
- Characteristics
- Scalable, fault-tolerant, and completely
self-organizing.
4Introduction (cont..)
- Napster
- File lookup is centralized.
- Gnutella
- Floods each file request across the network, which
does not scale.
- CAN provides a solution
- Scalable: each node maintains only a small amount
of control state.
- Distributed: the hash table is spread across all
peers, so there is no central point.
5Design
- Each node stores a chunk (a zone) of the hash table
and information about a small number of adjacent zones.
- Requests are forwarded toward the CAN node whose
zone contains the key.
- Indexing uses a virtual d-dimensional Cartesian
coordinate space.
- The coordinates are purely logical.
6Coordinate Space
Each node randomly picks a coordinate. The coordinate
space is dynamically partitioned. Each node owns
its individual zone.
[Figure: 2-d coordinate space with corners (0,0), (1,0), and (0,1), partitioned into zones]
7Design (cont..)
- Inserting a pair (key K1, value V1)
- A hash function maps K1 to a point P1 in the
coordinate space.
- The pair is then stored at the node that owns
the zone containing P1.
- Retrieving a value
- The requester applies the same hash function to the
key to find the point, and hence the node, holding the value.
- Each node learns and maintains a table of its
adjacent nodes.
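The insert and retrieve steps above can be sketched as follows. This is a minimal illustration, not the CAN implementation: `key_to_point`, `Zone`, `owner`, `insert`, and `retrieve` are hypothetical names, SHA-256 is an assumed choice of hash function, and a linear scan over zones stands in for CAN routing.

```python
import hashlib

def key_to_point(key, d=2):
    """Hash a key to a point in the unit space [0, 1)^d (assumes d <= 8)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest per dimension.
    return tuple(
        int.from_bytes(digest[4 * i: 4 * i + 4], "big") / 2**32
        for i in range(d)
    )

class Zone:
    """An axis-aligned box [lo, hi) owned by one node, plus its local store."""
    def __init__(self, lo, hi):
        self.lo, self.hi, self.store = tuple(lo), tuple(hi), {}

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

def owner(zones, point):
    """Find the zone containing the point (a scan replaces routing here)."""
    return next(z for z in zones if z.contains(point))

def insert(zones, key, value):
    owner(zones, key_to_point(key)).store[key] = value

def retrieve(zones, key):
    return owner(zones, key_to_point(key)).store.get(key)

# Four equal zones tiling the unit square.
zones = [Zone((x, y), (x + 0.5, y + 0.5)) for x in (0, 0.5) for y in (0, 0.5)]
insert(zones, "K1", "V1")
print(retrieve(zones, "K1"))  # V1
```

Because both operations hash the key the same way, the retriever reaches the same zone as the inserter without any directory.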
8Routing
- Information needed for routing
- Each CAN node holds a routing table containing the
IP address and virtual coordinate zone of each of its neighbors.
- Two nodes are neighbors if their zones overlap along
d-1 dimensions and abut along the remaining dimension.
- In a d-dimensional coordinate space, each node
maintains 2d neighbors.
9In the figure, nodes 5 and 1 are neighbors: 5's zone
overlaps 1's along the Y axis and abuts it along the X axis.
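The neighbor rule can be written as a small predicate. This is a sketch under stated assumptions: zones are `(lo, hi)` pairs of coordinate tuples, the function names are hypothetical, wraparound at the edges of the space is ignored, and the zone positions used for nodes 1 and 5 are made up for illustration rather than read from the figure.

```python
def spans_overlap(a_lo, a_hi, b_lo, b_hi):
    """True if the open intervals (a_lo, a_hi) and (b_lo, b_hi) intersect."""
    return a_lo < b_hi and b_lo < a_hi

def spans_abut(a_lo, a_hi, b_lo, b_hi):
    """True if one interval ends exactly where the other begins."""
    return a_hi == b_lo or b_hi == a_lo

def are_neighbors(zone_a, zone_b):
    """Neighbors overlap along d-1 dimensions and abut along exactly one."""
    (a_lo, a_hi), (b_lo, b_hi) = zone_a, zone_b
    d = len(a_lo)
    abut = sum(spans_abut(a_lo[i], a_hi[i], b_lo[i], b_hi[i]) for i in range(d))
    overlap = sum(spans_overlap(a_lo[i], a_hi[i], b_lo[i], b_hi[i]) for i in range(d))
    return abut == 1 and overlap == d - 1

# Hypothetical positions: node 1 bottom-left, node 5 to its right.
zone_1 = ((0.0, 0.0), (0.5, 0.5))
zone_5 = ((0.5, 0.0), (1.0, 0.5))
diagonal = ((0.5, 0.5), (1.0, 1.0))
print(are_neighbors(zone_1, zone_5))    # True
print(are_neighbors(zone_1, diagonal))  # False
```

The diagonal zone touches node 1 only at a corner, so it abuts in two dimensions and is rejected.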
10Routing (Cont..)
- Each CAN message carries its destination (a point in
the coordinate space).
- A node routes a message by greedily forwarding it to
the neighbor closest to the destination.
- With the space divided into n equal zones, the average
path length is $(d/4)\,n^{1/d}$ hops.
- Because many paths exist between two points, routing
can continue even if some nodes fail.
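Greedy forwarding and the path-length estimate can be checked on a toy model. The sketch below assumes a perfectly regular partition: k equal cells per dimension with wraparound (the CAN coordinate space is a torus), so every cell has exactly 2d neighbors. All function names are hypothetical.

```python
from itertools import product

def ring_dist(a, b, k):
    """Distance between cells a and b on a ring of k cells."""
    return min((a - b) % k, (b - a) % k)

def torus_dist(p, q, k):
    return sum(ring_dist(a, b, k) for a, b in zip(p, q))

def neighbors(cell, k):
    """The 2d neighbors of a cell: +/-1 along each dimension, wrapping."""
    for i in range(len(cell)):
        for step in (-1, 1):
            yield cell[:i] + ((cell[i] + step) % k,) + cell[i + 1:]

def greedy_route(src, dst, k):
    """Forward to the neighbor closest to dst until dst is reached."""
    path = [src]
    while path[-1] != dst:
        path.append(min(neighbors(path[-1], k),
                        key=lambda c: torus_dist(c, dst, k)))
    return path

def average_hops(k, d):
    cells = list(product(range(k), repeat=d))
    origin = cells[0]  # by symmetry, one source is enough
    return sum(len(greedy_route(origin, c, k)) - 1 for c in cells) / len(cells)

k, d = 8, 2                            # n = k**d = 64 equal zones
print(average_hops(k, d))              # 4.0
print((d / 4) * (k ** d) ** (1 / d))   # 4.0
```

On this regular grid the measured average matches the formula exactly; in a real CAN the zones are unequal, so the formula is an estimate.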
11Construction
- 1. First, the new node must find a node already in
the CAN.
- 2. Next, using the CAN routing mechanisms, it
must find a node whose zone will be split.
- 3. Finally, the neighbors of the split zone must
be notified so that routing can include the new
node.
12Bootstrap
- One or more bootstrap nodes are found by resolving
the CAN's DNS domain name.
- A bootstrap node maintains a partial list of CAN
nodes it believes are currently in the system.
- To join a CAN, a new node looks up the CAN domain
name in DNS to retrieve a bootstrap node's IP
address.
- The bootstrap node then supplies the IP addresses
of several randomly chosen nodes currently in the
system.
13Finding a zone
- The new node randomly chooses a point P in the space.
- It sends a JOIN request destined for P.
- The request enters the CAN via any existing CAN node
and is routed to the node whose zone contains P.
- That occupant node then splits its zone in
half and assigns one half to the new node.
- Splits follow a fixed ordering of the dimensions, so
that zones can be re-merged when nodes leave.
- E.g., in 2-d, a zone is split first along the X
dimension and then along the Y dimension.
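The ordered split can be sketched as below. This is an illustration under assumptions: a zone is a `(lo, hi)` pair of tuples, a per-zone `depth` counter (a hypothetical bookkeeping device) records how many times the zone has been split so the next dimension is `depth % d`, and which half goes to the new node is chosen arbitrarily.

```python
def split_zone(lo, hi, depth):
    """Split box [lo, hi) in half along dimension depth % d.

    Returns (kept_half, new_half, next_depth).
    """
    d = len(lo)
    dim = depth % d
    mid = (lo[dim] + hi[dim]) / 2
    kept = (lo, hi[:dim] + (mid,) + hi[dim + 1:])
    new = (lo[:dim] + (mid,) + lo[dim + 1:], hi)
    return kept, new, depth + 1

# First join splits the whole 2-d space along X, the next split goes along Y.
whole = ((0.0, 0.0), (1.0, 1.0))
kept, new, depth = split_zone(*whole, 0)
print(kept, new)    # ((0.0, 0.0), (0.5, 1.0)) ((0.5, 0.0), (1.0, 1.0))
kept2, new2, depth = split_zone(*kept, depth)
print(kept2, new2)  # ((0.0, 0.0), (0.5, 0.5)) ((0.0, 0.5), (0.5, 1.0))
```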
14Maintenance
- Departure of a Node
- Single Node Failure
- Multiple Failure
15Departure of a Node
- A departing node explicitly hands over its zone and
the associated (key, value) entries to one of its neighbors.
- If the zone of one of the neighbors can be merged
with the departing node's zone to produce a valid
single zone, then this is done.
- If not, the zone is handed to the neighbor
whose current zone is smallest, and that node
temporarily handles both zones.
16Departure of a Node
When node F fails, its zone is merged with E's zone.
[Figure: 2-d coordinate space with corners (0,0), (1,0), and (0,1)]
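The handover rule can be sketched as follows. This is a simplification under assumptions: "valid single zone" is approximated as "the union of the two zones is itself a box", zones are `(lo, hi)` pairs of tuples, the function names are hypothetical, and the positions given to zones E and F are invented for illustration.

```python
def volume(zone):
    lo, hi = zone
    v = 1.0
    for l, h in zip(lo, hi):
        v *= h - l
    return v

def try_merge(a, b):
    """Return the merged box if a and b differ in one dimension and abut there."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    differing = [i for i in range(len(a_lo))
                 if (a_lo[i], a_hi[i]) != (b_lo[i], b_hi[i])]
    if len(differing) != 1:
        return None
    i = differing[0]
    if a_hi[i] == b_lo[i]:
        return (a_lo, b_hi)
    if b_hi[i] == a_lo[i]:
        return (b_lo, a_hi)
    return None

def hand_over(departing, neighbor_zones):
    """Return (index of the neighbor taking over, merged zone or None)."""
    for idx, zone in enumerate(neighbor_zones):
        merged = try_merge(departing, zone)
        if merged is not None:
            return idx, merged
    # No valid merge: the smallest neighbor temporarily handles both zones.
    smallest = min(range(len(neighbor_zones)),
                   key=lambda i: volume(neighbor_zones[i]))
    return smallest, None

zone_f = ((0.5, 0.0), (1.0, 0.5))   # hypothetical position of F
zone_e = ((0.5, 0.5), (1.0, 1.0))   # hypothetical position of E
print(hand_over(zone_f, [zone_e]))  # (0, ((0.5, 0.0), (1.0, 1.0)))
```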
17Failures
- A prolonged absence of periodic update messages from
a neighbor indicates that the neighbor has failed.
- Each neighbor of the failed node then independently
starts a takeover timer, set in proportion to its own zone volume.
- When the timer expires, the node sends a TAKEOVER
message conveying its own zone volume to all of
the failed node's neighbors.
- On receiving a TAKEOVER message, a node cancels its
own timer if the zone volume in the message is
smaller than its own zone volume.
- Otherwise it replies with its own TAKEOVER message.
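The TAKEOVER exchange can be sketched as a small deterministic model. It assumes each takeover timer is proportional to the node's own zone volume, so the smallest-volume neighbor's timer expires first; message delay and ties between equal volumes are not modeled. `on_takeover` and `elect` are hypothetical names.

```python
def on_takeover(my_volume, msg_volume):
    """Reaction of a node that receives TAKEOVER(msg_volume)."""
    # A smaller sender wins: cancel our own timer. Otherwise answer back.
    return "cancel" if msg_volume < my_volume else "reply"

def elect(volumes):
    """volumes: node id -> zone volume, for the failed node's live neighbors.

    Returns (node whose timer fires first, reactions of the other nodes).
    """
    first = min(volumes, key=volumes.get)
    reactions = {node: on_takeover(vol, volumes[first])
                 for node, vol in volumes.items() if node != first}
    return first, reactions

print(elect({"A": 0.25, "B": 0.125, "C": 0.5}))
# ('B', {'A': 'cancel', 'C': 'cancel'})
```

The effect is that the failed zone goes to the live neighbor with the smallest zone, which helps keep zone volumes balanced.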
18Multiple Failure
- A node first performs an expanding ring search for
nodes residing beyond the failed region.
- It then rebuilds enough neighbor state to take over
safely.
19Design Improvements
- Multi-dimensional coordinate spaces
- Increasing the dimensions of the CAN coordinate
space reduces the routing path length, and hence
the path latency.
- More dimensions -> more neighbors -> more
routing paths -> better routing fault
tolerance.
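The trade-off between path length and neighbor state can be worked out from the average path length estimate of (d/4) n^(1/d) hops and the 2d neighbors per node given on the routing slides. The helper names below are hypothetical, and the numbers assume an equally partitioned space.

```python
def avg_path_length(n, d):
    """Estimated average hops for n equal zones in d dimensions."""
    return (d / 4) * n ** (1 / d)

def neighbor_count(d):
    """Neighbors each node maintains in d dimensions."""
    return 2 * d

n = 2 ** 18
for d in (2, 3, 6):
    print(d, neighbor_count(d), round(avg_path_length(n, d), 1))
# 2 4 256.0
# 3 6 48.0
# 6 12 12.0
```

Going from 2 to 6 dimensions triples the per-node neighbor state but cuts the estimated path length by a factor of about 21.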
21Design Improvements
- Realities: multiple coordinate spaces
- The CAN maintains multiple, independent
coordinate spaces, with each node in the system
assigned a different zone in each one.
Each such coordinate space is a reality.
- Each entry is stored at the same coordinate in every
reality; a node forwards a request to the neighbor,
across all realities, that is closest to the destination.
- This reduces the average path length.
- Multiple dimensions vs. multiple realities
- For the same number of neighbors, adding dimensions
shortens paths more, but multiple realities give better
fault tolerance and data availability.
22Design Improvements
- Overloading coordinate zones
- Allow multiple nodes to share the same zone.
Nodes that share the same zone are termed peers.
- MAXPEERS is the maximum number of
allowable peers per zone.
- Reduced path length (number of hops), and hence
reduced path latency.
- Improved fault tolerance.
- Multiple hash functions
- Similar in effect to multiple realities: k hash
functions map one key to k points, so each entry is
replicated at k nodes.
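Multiple hash functions can be sketched by salting a single hash with the function index. This is an assumed construction for illustration, not the one the authors used; `points_for_key`, `torus_dist`, and `closest_replica` are hypothetical names, and the distance wraps around because the CAN coordinate space is a torus.

```python
import hashlib

def points_for_key(key, k, d=2):
    """Map one key to k points in [0, 1)^d, one per (salted) hash function."""
    points = []
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        points.append(tuple(
            int.from_bytes(digest[4 * j: 4 * j + 4], "big") / 2**32
            for j in range(d)
        ))
    return points

def torus_dist(p, q):
    """Per-dimension distance with wraparound on the unit torus."""
    return sum(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q))

def closest_replica(requester_point, points):
    """Pick the replica point nearest to the requester in coordinate space."""
    return min(points, key=lambda p: torus_dist(requester_point, p))

replicas = points_for_key("K1", k=3)
target = closest_replica((0.1, 0.1), replicas)
```

A requester that queries only the closest of the k points gets a shorter path on average, and the entry survives the loss of any single replica.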
23Design Improvements
- Topologically-sensitive construction of the CAN
overlay network
- Each CAN node measures its round-trip time to each
of a set of landmarks and orders the landmarks by increasing RTT.
- With m landmarks, m! such orderings are possible.
- The coordinate space is divided into m! equal
portions, each assigned to one landmark ordering.
- A new node joins the CAN at a random point in
the portion of the coordinate space associated
with its landmark ordering.
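Computing a node's landmark ordering and the portion it maps to can be sketched as below. The layout is an assumption made for illustration: the m! portions are taken to be equal strips along the first coordinate, indexed by the lexicographic rank of the ordering. The function names and RTT values are hypothetical.

```python
from itertools import permutations
from math import factorial

def landmark_ordering(rtts):
    """Landmark indices sorted by increasing measured RTT."""
    return tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))

def portion_index(ordering):
    """Lexicographic rank of the ordering among all m! permutations."""
    return list(permutations(range(len(ordering)))).index(tuple(ordering))

def portion_x_range(ordering):
    """Assumed layout: strip number portion_index out of m! equal strips."""
    count = factorial(len(ordering))
    idx = portion_index(ordering)
    return idx / count, (idx + 1) / count

rtts_ms = [80.0, 12.0, 45.0]         # made-up RTTs to landmarks 0, 1, 2
ordering = landmark_ordering(rtts_ms)
print(ordering)                      # (1, 2, 0)
print(portion_index(ordering))       # 3
```

Nodes with the same ordering land in the same portion, so nodes that are close in the underlying network tend to be close in the coordinate space.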
24Design Improvements
- More Uniform Partitioning
- Before splitting, the node that receives a JOIN
compares the volume of its zone with those of its
immediate neighbors in the coordinate space.
- The zone with the largest volume is split.
- Without the uniform partitioning feature, a little
over 40% of the nodes are assigned to zones with
volume V (the volume under a perfectly even
partition), compared to almost 90% with this
feature, and the largest zone volume drops from 8V to 2V.
- Not surprisingly, the partitioning of the space
further improves with increasing dimensions.
- Caching and Replication techniques
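The volume comparison used for more uniform partitioning can be sketched in a few lines. The function name, node ids, and volumes are hypothetical; ties are resolved in favor of the occupant, which is an assumption.

```python
def choose_zone_to_split(occupant, neighbors, volumes):
    """Return the node whose zone is split for a JOIN that reached occupant.

    volumes: node id -> zone volume. The largest zone among the occupant
    and its immediate neighbors is chosen; on a tie the occupant wins
    because max() returns the first maximal candidate.
    """
    candidates = [occupant] + list(neighbors)
    return max(candidates, key=lambda node: volumes[node])

volumes = {"A": 0.125, "B": 0.25, "C": 0.0625}
print(choose_zone_to_split("A", ["B", "C"], volumes))  # B
```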
26Design Review
- The following metrics were used to evaluate system
performance
- Path length: the number of (application-level)
hops required to route between two points in the
coordinate space.
- Neighbor-state: the number of CAN nodes for which
an individual node must retain state.
- Latency: we consider both the end-to-end latency
of the total routing path between two points in
the coordinate space and the per-hop latency,
i.e., latency of individual application-level
hops obtained by dividing the end-to-end latency
by the path length.
- Volume: the volume of the zone to which a node is
assigned, which is indicative of the request and
storage load a node must handle.
- Routing fault tolerance: the availability of
multiple paths between two points in the CAN.
- Hash table availability: adequate replication of
a (key, value) entry to withstand the loss of one
or more replicas.
27Design Review
- The key design parameters affecting system
performance are
- dimensionality of the virtual coordinate space: d
- number of realities: r
- number of peer nodes per zone: p
- number of hash functions (i.e., number of points
per reality at which a (key, value) pair is
stored): k
- use of the RTT-weighted routing metric
- use of the uniform partitioning feature
- Test system specification
- A system size of $n = 2^{18}$ nodes, on a Transit-Stub
topology with delays of 100ms on intra-transit
links, 10ms on stub-transit links and 1ms on
intra-stub links (i.e., 100ms on links that
connect two transit nodes, 10ms on links that
connect a transit node to a stub node and so
forth).
- Transit-stub models explicitly group vertices
into domains, and reflect that grouping in the
connectivity between vertices.
28100 node transit-stub topology
29Bare bones: CAN that does not utilize most of
our additional design features.
Knobs-on-full: CAN making full use of our added
features (without the landmark ordering feature).
30Related Work
- Related Algorithms
- Distance vector and Link State algorithms
- These require widespread dissemination of topology information.
- CAN, on the other hand, keeps only a small amount of state per node.
- Plaxton algorithm
- Each node has an n-bit label divided into l levels.
- Each level has width w = n/l.
- Each node forwards a packet to a neighbor whose
label matches the destination label in more
digits.
31Related Work
- Algorithms with geographic routing.
- "Space" in these algorithms refers to physical
space.
- No neighbor discovery problem.
- Correctly mimicking the space is a trivial problem.
- They are not extensible to multiple dimensions.
32Related System
- Domain Name System
- Stores (domain name, IP address) mappings.
- OceanStore
- Provides continuous access to persistent
information.
- Uses Plaxton's algorithm.
- Peer-to-Peer file sharing systems
- Freenet
- Nodes store keys (analogous to URLs), addresses of
other nodes, and the data corresponding to the keys.
33Discussion
- Addresses two key problems in the design of
Content-Addressable Networks: scalable routing
and indexing.
- Simulation results validate the scalability of
our overall design: for a CAN with over 260,000
nodes, we can route with a latency that is less
than twice the IP path latency.
- Future work
- Secure CAN
- Keyword searching