Title: Discovery and Consistent Hashing
1. Discovery and Consistent Hashing
- Chris Cabanne ccabanne_at_uci.edu
2. Discovery and Cyber Entity Lookup
- Primary challenge: efficiently locating services (i.e., Cyber Entities) across BioNet platforms in a decentralized manner.
- Methods:
  - Relationship discovery
  - Indexing
3. DNS and the Internet
- DNS: the domain name space.
- Host name to IP address mapping,
  e.g. www.uci.edu -> 128.200.222.100
- Indexing is done in a hierarchical manner:
  request -> root server -> master name server (.edu) -> return address.
- The returned address is used in network-layer routing.
- Constraints:
  - Semi-static network: IP addresses do not move geographically.
  - Possible point of failure: the root DNS servers.
4. DNS and BioNet
- A platform (semi-static) may leave or rejoin the network but stays in a static geographic location.
- A CE (dynamic) may migrate from one platform to the next.
- How do we keep track of a CE's location?
  - A CE may migrate.
  - A CE may die.
  - There may be several cached copies of a Cyber Entity.
- Locating a unique CE vs. locating a CE that is a member of a group.
5. Consistent Hashing
- One possible solution: consistent hashing.
- Originally a distributed caching technique to relieve hot spots on the Internet.
- Benefits:
  - Flexible naming: no naming structure is imposed on CEs.
  - Decentralization: no central control.
  - Availability: migrating CEs can still be located.
6. Some network protocols that implement consistent hashing
- OceanStore: http://oceanstore.cs.berkeley.edu/publications/papers/abstracts/silverback_sosp_tr.html
- Chord: http://www.pdos.lcs.mit.edu/chord/
- Grid: http://www.pdos.lcs.mit.edu/grid/
7. Consistent Hashing Claims
- Claim: in a network with N nodes (N BioNet platforms) and K keys (Cyber Entities):
  - Every node is responsible for holding O( K / N ) keys.
  - Every CE lookup is resolved via O( log N ) messages.
  - Each node maintains routing information for only O( log N ) other nodes.
8. Consistent Hashing Claims (Continued)
- Claims (continued):
  - When a node joins or leaves the network, the routing information between nodes is brought up to date with O( log² N ) messages.
  - Correct lookup of a CE is guaranteed as long as each node holds one piece of correct routing information.
- OK, but how does it work?
9. Consistent Hashing
- Basic idea: given a key, map the key onto a node.
- All nodes and keys are assigned an m-bit identifier using a base hash function (such as SHA-1).
- A node's identifier (which must be unique) is chosen by hashing the node's IP address.
- A key's identifier is produced by hashing the key.
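The identifier assignment above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `identifier`, the choice to keep the most significant m bits of the SHA-1 digest, and the sample inputs are all assumptions made for the example.

```python
import hashlib

def identifier(value: str, m: int = 160) -> int:
    """Hash a string to an m-bit identifier using SHA-1 (1 <= m <= 160)."""
    if not 1 <= m <= 160:
        raise ValueError("SHA-1 yields 160 bits, so m must be in 1..160")
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    # Assumption: truncate by keeping the m most significant bits.
    return int.from_bytes(digest, "big") >> (160 - m)

# Hypothetical inputs: a platform's IP address and a Cyber Entity name.
node_id = identifier("128.200.222.100", m=16)
key_id = identifier("cyber-entity-42", m=16)
```

Nodes and keys go through the same function, so both land in the same identifier space of size 2^m. A small m (as in the m = 3 example later in the slides) makes collisions likely, which is why a real deployment would use a large m.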
10. Assigning Keys to Nodes
- Identifiers are ordered on a circular number line from 0 to $2^m - 1$.
- Key k is assigned to the first node whose identifier is equal to or follows k's identifier on the number line.
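The assignment rule can be sketched as a search over the sorted node identifiers with wrap-around. The helper name `successor` is an assumption for illustration; the node and key values are the ones used in the m = 3 example in these slides.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Return the first node identifier equal to or following key_id on the ring."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key_id)
    # Past the largest node identifier, wrap around to the smallest one.
    return ring[index % len(ring)]

nodes = [0, 1, 3]  # m = 3, so identifiers run from 0 to 7
assignment = {key: successor(nodes, key) for key in (1, 2, 6)}
# assignment == {1: 1, 2: 3, 6: 0}
```

Key 6 has no node at or after it before the circle wraps, so it falls to node 0.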
11. Example of number line
m = 3. There are three nodes: 0, 1, and 3. There are three keys: 1, 2, and 6. Key 1 is located at node 1, key 2 at node 3, and key 6 at node 0.
[Figure: identifier circle for m = 3 with positions 0 to 7; keys 1, 2, and 6 are marked.]
12. Scalable Key Location
- Only a small amount of routing information needs to be kept at each node for correctness.
- If each node keeps a pointer to the next node on the number line, then the network is fully connected: every node can reach every other node (although slowly).
- To speed things up, every node keeps routing data for a small number of other nodes.
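To see why successor pointers alone are correct but slow, here is a minimal sketch that walks the ring one pointer at a time. The names `in_half_open`, `walk_lookup`, and `next_node` are hypothetical, and the whole ring is held in one dictionary purely for illustration; in a real network each node would know only its own pointer.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    # The interval wraps past zero; when a == b it covers the whole ring.
    return x > a or x <= b

def walk_lookup(next_node, start, key_id):
    """Follow successor pointers from start until the owner of key_id is found.

    Returns (owner, number of successor pointers followed).
    """
    current, hops = start, 0
    while not in_half_open(key_id, current, next_node[current]):
        current = next_node[current]
        hops += 1
    return next_node[current], hops + 1

ring = sorted([0, 1, 3])
# Each node points to the next node on the number line.
next_node = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
owner, hops = walk_lookup(next_node, start=0, key_id=6)
# owner == 0, hops == 3
```

In the worst case the query visits every node, so the cost grows linearly with N. The extra routing data mentioned in the last bullet is what brings this down to O( log N ).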
13. Node Routing Table
- Each node n contains a table that has at most m entries (m is the bit length of the identifier hash). This is because there can be at most $2^m$ unique nodes.
- The i-th entry contains the identity of the first node, s, that succeeds n by at least $2^{i-1}$. The start of the i-th entry is calculated as $(n + 2^{i-1}) \bmod 2^m$, $1 \le i \le m$.
- A table entry contains both the node identifier number and the IP address of the node.
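A routing table built from this rule can be sketched as follows. The function names `successor` and `finger_table` are assumptions for illustration, and the sketch stores only identifiers (the IP address that each entry also carries is omitted). It uses global knowledge of all node identifiers, which a real node would not have.

```python
from bisect import bisect_left

def successor(ring, ident):
    """First node in the sorted ring at or after ident, wrapping around."""
    return ring[bisect_left(ring, ident) % len(ring)]

def finger_table(n, node_ids, m):
    """Return m entries (start, successor node) for node n."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(ring, start)))
    return table

table = finger_table(1, [0, 1, 3], m=3)
# table == [(2, 3), (3, 3), (5, 0)]
```

For node 1 in the three-node example this gives starts 2, 3, 5 and successors 3, 3, 0. Because the starts double in distance, the table is dense near n and sparse farther around the circle.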
14. Characteristics of Node Table
- Each node stores information about only a small number of other nodes.
- A node has more information about the nodes closely following it on the identifier circle than about nodes farther away.
- A node's table generally does not contain enough information to determine the successor of an arbitrary key.
- A node can forward the request to another node in its table that has more information about a key's identifier.
15. Searching
- Node n searches its table for the node j whose identifier most immediately precedes key k's identifier.
- n passes the query on to j.
- The process is repeated until the node responsible for k's identifier is found.
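The search steps above can be sketched as a small single-process simulation in the style of Chord. All names here (`between`, `build_fingers`, `lookup`) are hypothetical, the tables are built with global knowledge for convenience, and message passing is replaced by a plain loop, so this shows the routing logic only.

```python
from bisect import bisect_left

def successor(ring, ident):
    """First node in the sorted ring at or after ident, wrapping around."""
    return ring[bisect_left(ring, ident) % len(ring)]

def between(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    # The interval wraps past zero; when a == b it is the ring minus a.
    return x > a or x < b

def build_fingers(node_ids, m):
    """Map each node to the successors of (n + 2^i) mod 2^m for i = 0..m-1."""
    ring = sorted(node_ids)
    return {n: [successor(ring, (n + 2 ** i) % 2 ** m) for i in range(m)]
            for n in ring}

def lookup(fingers, start, key_id):
    """Return (responsible node, list of nodes the query visited)."""
    current, path = start, [start]
    # fingers[n][0] is n's immediate successor on the circle.
    while not between(key_id, current, fingers[current][0], inclusive_right=True):
        # Forward to the table entry that most closely precedes the key.
        current = next(f for f in reversed(fingers[current])
                       if between(f, current, key_id))
        path.append(current)
    return fingers[current][0], path

fingers = build_fingers([0, 1, 3], m=3)
owner, path = lookup(fingers, start=1, key_id=6)
# owner == 0, path == [1, 3]
```

Each forwarding step moves the query at least halfway toward the key in a well-populated ring, which is where the O( log N ) message bound comes from.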
16. Table for Node 1
Start       2       3       5
Interval    [2,3)   [3,5)   [5,1)
Successor   3       3       0
[Figure: identifier circle for m = 3 with positions 0 to 7; keys 1, 2, and 6 are marked.]
17. Big Picture
- Each Cyber Entity (a key) is given an identifier number derived from keywords or from a unique Cyber Entity ID.
- Next, the Cyber Entity's location data is mapped to the proper BioNet platform (node).
18. Big Picture (continued)
- This location data is the IP address of the BioNet platform on which the CE currently resides.
- If the CE migrates, O( log N ) messages are sent to update the platform that holds the CE's location data.
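The two Big Picture slides can be tied together in a toy, single-process sketch. The class name `LocationDirectory`, its methods, the 32-bit identifier size, and the sample IP addresses and CE ID are all assumptions for illustration; a real BioNet deployment would route each call through the distributed lookup rather than a local dictionary.

```python
from bisect import bisect_left
import hashlib

class LocationDirectory:
    """Toy stand-in for the distributed index of CE locations."""

    def __init__(self, platform_ips, m=32):
        self.m = m
        # Each platform identifier maps to the location records it holds.
        self.records = {self._ident(ip): {} for ip in platform_ips}
        self.ring = sorted(self.records)

    def _ident(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") >> (160 - self.m)

    def _home(self, ce_id):
        """Platform responsible for this CE's record (successor of its identifier)."""
        index = bisect_left(self.ring, self._ident(ce_id))
        return self.ring[index % len(self.ring)]

    def update(self, ce_id, platform_ip):
        """Record (or overwrite) the platform on which the CE currently resides."""
        self.records[self._home(ce_id)][ce_id] = platform_ip

    def locate(self, ce_id):
        """Return the CE's last recorded platform IP, or None if unknown."""
        return self.records[self._home(ce_id)].get(ce_id)

directory = LocationDirectory(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
directory.update("ce-7", "10.0.0.1")   # CE starts on one platform
directory.update("ce-7", "10.0.0.3")   # CE migrates; same record is overwritten
current_ip = directory.locate("ce-7")
```

The platform that stores a CE's record depends only on the CE's identifier, not on where the CE currently lives, so a migration changes one record's value and never moves the record itself.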
19. Other Aspects of Consistent Hashing
- More complicated to explain (and to prepare slides for) are node joining and rejoining.
- Node failure, etc.
20. Future Work
- Analyze the network traffic generated when many CEs are migrating.
- Analyze what happens when CE location data is lost in a node failure.