Discovery and Consistent Hashing - PowerPoint PPT Presentation

Provided by: Chr116

Category:

more less

Transcript and Presenter's Notes

Title: Discovery and Consistent Hashing

1
Discovery and Consistent Hashing

Chris Cabanne ccabanne_at_uci.edu

2
Discovery and Cyber Entity Lookup

Primary challenge: efficiently locating services
(i.e. Cyber-Entities) across BioNet platforms in
a decentralized manner.
Methods:
Relationship Discovery
Indexing

3
DNS and the Internet

DNS domain name space.
Host Name to IP address Mapping.
e.g. www.uci.edu -> 128.200.222.100
Indexing is done in a hierarchical manner.
Request -> Root Server -> Master Name Server
(.edu) -> return address.
Addresses are used in network-layer routing.
Constraints:
Semi-static network: IP addresses do not move
geographically.
Possible point of failure: root DNS servers.

4
DNS and BioNet

Platform (semi-static): may leave or rejoin the
network but stays in a static geographic
location.
CE (dynamic): may migrate from one platform to
the next.
How do we keep track of a CE's location?
CE may migrate.
CE may die.
There may be several cached copies of a Cyber
Entity.
Locating a unique CE vs. locating a CE that is a
member of a group.

5
Consistent Hashing

One possible solution: Consistent Hashing
Originally a distributed caching technique to
relieve hot spots on the Internet.
Benefits:
Flexible naming: no imposed naming structure for
CEs.
Decentralization: no central control.
Availability: allows migrating CEs to be
located.

6
Some network protocols that implement consistent
hashing

OceanStore: http://oceanstore.cs.berkeley.edu/publications/papers/abstracts/silverback_sosp_tr.html
Chord: http://www.pdos.lcs.mit.edu/chord/
Grid: http://www.pdos.lcs.mit.edu/grid/

7
Consistent Hashing Claims

Claim: In a network with N nodes (N BioNet
platforms) and K keys (Cyber Entities):
Every node is responsible for holding O( K / N )
keys.
All CE lookups are resolved via O( log N ) messages.
Each node maintains routing information for only
O( log N ) other nodes.

8
Consistent Hashing Claims (Continued)

Claims (Continued)
When a node joins or leaves the network,
routing information is brought up to date with
O( log² N ) messages.
Correct lookup of a CE is guaranteed as long as
each node holds one piece of correct routing
information.
Ok, but how does it work?

9
Consistent Hashing

Basic idea: given a key, map the key onto a node.
All nodes and keys are assigned an m-bit
identifier using a base hash function (like
SHA-1).
A node's identifier (must be unique) is chosen by
hashing the node's IP address.
A key's identifier is produced by hashing the key.
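
As a minimal sketch of the identifier step, the snippet below hashes a string with SHA-1 and keeps the result modulo 2^m. The value m = 16, the function name `identifier`, the IP address, and the Cyber Entity name are all assumptions made for illustration; the slides do not fix any of them.

```python
import hashlib

M = 16  # identifier length in bits (assumed; the slides leave m open)


def identifier(value: str, m: int = M) -> int:
    """Hash a string with SHA-1 and reduce the digest to an m-bit identifier."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = identifier("192.0.2.10")       # hypothetical platform IP address
key_id = identifier("cyber-entity-42")   # hypothetical Cyber Entity name
```

Both identifiers land in the same space, 0 to 2^m - 1, which is what lets keys be compared against nodes. With a small m, two nodes can collide on the same identifier, so m has to be large enough to make that unlikely.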

10
Assigning Keys to Nodes

Identifiers are ordered on a circular number line
from 0 to 2^m - 1.
Key k is assigned to the first node whose
identifier is equal to or follows k's identifier
on the number line.
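
The assignment rule can be sketched in a few lines. The function name `successor` is chosen here for illustration, and the ring used is the 3-bit one from the number-line example in this deck (nodes 0, 1, and 3; keys 1, 2, and 6).

```python
from bisect import bisect_left


def successor(node_ids, key_id):
    """Return the first node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    # Past the largest identifier, wrap around to the smallest one.
    return ring[index % len(ring)]


nodes = [0, 1, 3]  # 3-bit ring: identifiers run from 0 to 7
assignment = {key: successor(nodes, key) for key in (1, 2, 6)}
```

Here `assignment` maps key 1 to node 1, key 2 to node 3, and key 6 to node 0, because key 6 wraps past 7 back to the start of the circle.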

11
Example of number line
m = 3. There are three nodes: 0, 1, and 3. There
are three keys: 1, 2, and 6. Key 1 is located at
node 1, key 2 is located at node 3, key 6 at node
0.
[Figure: identifier circle with positions 0 to 7, showing nodes 0, 1, and 3 and keys 1, 2, and 6.]

12
Scalable Key Location

Only a small amount of routing information needs
to be kept at each node for correctness.
If each node keeps a pointer to the next node on
the number line, then the network is fully
connected: every node can reach every other node
(although slowly).
To speed things up, every node also keeps routing
data for a small number of other nodes.
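
To see why successor pointers alone are enough for correctness but slow, here is a small sketch that walks the ring one node at a time. The names `in_half_open`, `slow_lookup`, and `next_node` are illustrative, and the ring is the 3-bit example with nodes 0, 1, and 3.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    # The interval wraps past zero (or a == b, which covers the whole circle).
    return x > a or x <= b


def slow_lookup(start, key_id, next_node):
    """Follow successor pointers until the node responsible for key_id is reached."""
    current, hops = start, 0
    while not in_half_open(key_id, current, next_node[current]):
        current = next_node[current]
        hops += 1
    return next_node[current], hops


next_node = {0: 1, 1: 3, 3: 0}  # each node knows only its successor
owner, hops = slow_lookup(1, 6, next_node)
```

In the worst case the query visits every node on the ring, so the number of hops grows linearly with the number of nodes. The extra routing data is what brings this down.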

13
Node Routing table

Each node n contains a table that has at most m
entries (m is the bit length of the identifier hash).
This is because there can be at most 2^m unique nodes.
The ith entry contains the identity of the first
node, s, that succeeds n by at least 2^(i-1).
s is the successor of ( n + 2^(i-1) ) mod 2^m, 1 <= i <= m.
A table entry contains both the node identifier
number and the IP address of the node.
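
The table construction can be sketched as follows. The function name `finger_table` is an assumption, and this version is computed from a global list of node identifiers purely for illustration; a real node would learn its entries through messages and would also store each node's IP address.

```python
def finger_table(n, node_ids, m):
    """Return a list of (start, successor) pairs for node n, one per entry i = 1..m."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        # First node whose identifier is equal to or follows start, wrapping if needed.
        succ = next((x for x in ring if x >= start), ring[0])
        table.append((start, succ))
    return table


table = finger_table(1, [0, 1, 3], m=3)
```

For node 1 on the 3-bit ring with nodes 0, 1, and 3, the starts are 2, 3, and 5 and the successors are 3, 3, and 0, which matches the table for node 1 given in this deck.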

14
Characteristics of Node Table

Each node stores information about only a small
number of other nodes.
A node has more information about the nodes
closely following it on the identifier circle
than about nodes farther away.
A node table generally does not contain enough
information to determine the successor of an
arbitrary key.
A node can forward the request to another node in
its table that has more information about a key's
identifier.

15
Searching

Node n searches its table for the node j whose
identifier most immediately precedes key k's
identifier.
n passes the query on to j.
The process is repeated until the node responsible
for k's identifier is found.
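
The search steps above can be sketched as a loop over per-node tables. This is a single-process simulation under stated assumptions: the names `between`, `build_fingers`, and `lookup` are illustrative, tables are built from a global node list, and each "message" is just one loop iteration.

```python
def between(x, a, b):
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def build_fingers(node_ids, m):
    """Map each node to its list of table successors, nearest entry first."""
    ring = sorted(node_ids)

    def succ(ident):
        return next((x for x in ring if x >= ident), ring[0])

    return {n: [succ((n + 2 ** i) % 2 ** m) for i in range(m)] for n in ring}


def lookup(start, key_id, fingers):
    """Return (responsible node, list of nodes the query visited)."""
    n, path = start, [start]
    while True:
        succ = fingers[n][0]
        if key_id == succ or between(key_id, n, succ):
            return succ, path
        # Pick the farthest table entry that still precedes the key.
        n = next((f for f in reversed(fingers[n]) if between(f, n, key_id)), succ)
        path.append(n)


fingers = build_fingers([0, 1, 3], m=3)
owner, path = lookup(1, 6, fingers)
```

Starting at node 1 and looking for key 6, the query is passed to node 3, which knows that its successor, node 0, is responsible for the key.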

16
Table for Node 1
Entry i     1       2       3
Start       2       3       5
Interval    [2,3)   [3,5)   [5,1)
Successor   3       3       0
[Figure: identifier circle with positions 0 to 7, showing nodes 0, 1, and 3 and keys 1, 2, and 6.]

17
Big Picture

Each Cyber Entity (a key) is given an identifier
number based on keywords or a unique Cyber Entity
ID.
Next, the Cyber Entity's location data is
mapped to the proper BioNet platform (node).

18
Big Picture (continued)

This location data is the IP address of the
BioNet platform on which the CE currently resides.
If the CE migrates, O( log N ) messages are sent
to update the platform that holds the CE's
location data.
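
A rough in-memory sketch of this idea follows. The class `LocationDirectory`, its method names, the 32-bit identifier length, and the example IP addresses are all assumptions for illustration. The sketch does not model the O( log N ) routing messages; it only shows that the platform holding a CE's location record is fixed by the CE's identifier, so a migration changes the record but not where the record lives.

```python
import hashlib

M = 32  # identifier length in bits (assumed for this toy example)


def identifier(value):
    """Reduce the SHA-1 digest of a string to an M-bit identifier."""
    return int.from_bytes(hashlib.sha1(value.encode("utf-8")).digest(), "big") % 2 ** M


class LocationDirectory:
    def __init__(self, platform_ips):
        # Node identifier -> platform IP address.
        self.platforms = {identifier(ip): ip for ip in platform_ips}
        # Node identifier -> {CE id: IP of the platform where the CE resides}.
        self.records = {node: {} for node in self.platforms}

    def home_node(self, ce_id):
        """Node responsible for the CE's location record (successor of its key)."""
        key = identifier(ce_id)
        ring = sorted(self.platforms)
        return next((n for n in ring if n >= key), ring[0])

    def update_location(self, ce_id, current_ip):
        self.records[self.home_node(ce_id)][ce_id] = current_ip

    def locate(self, ce_id):
        return self.records[self.home_node(ce_id)].get(ce_id)


directory = LocationDirectory(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
directory.update_location("ce-7", "192.0.2.1")
home_before = directory.home_node("ce-7")
directory.update_location("ce-7", "192.0.2.3")  # the CE migrates
```

After the migration, `directory.locate("ce-7")` returns the new address, while `directory.home_node("ce-7")` is unchanged.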

19
Other aspects of Consistent Hashing