Discovery and Consistent Hashing

1
Discovery and Consistent Hashing
  • Chris Cabanne ccabanne_at_uci.edu

2
Discovery and Cyber Entity Lookup
  • Primary challenge: efficiently locating services
    (i.e., Cyber Entities) across BioNet platforms in
    a decentralized manner.
  • Methods:
    • Relationship Discovery
    • Indexing

3
DNS and the Internet
  • DNS domain name space.
  • Host name to IP address mapping.
    • e.g., www.uci.edu → 128.200.222.100
  • Indexing is done in a hierarchical manner.
    • Request → Root Server → Master Name Server
      (.edu) → return address.
  • Addresses are used in network-layer routing.
  • Constraints:
    • Semi-static network: IP addresses do not move
      geographically.
    • Possible point of failure: root DNS servers.

4
DNS and BioNet
  • Platform (semi-static): may leave or rejoin the
    network but stays in a static geographic
    location.
  • CE (dynamic): may migrate from one platform to
    the next.
  • How do we keep track of a CE's location?
    • A CE may migrate.
    • A CE may die.
    • There may be several cached copies of a Cyber
      Entity.
    • Locating a unique CE vs. locating a CE that is a
      member of a group.

5
Consistent Hashing
  • One possible solution: Consistent Hashing.
  • Originally a distributed caching technique to
    relieve hot spots on the Internet.
  • Benefits:
    • Flexible naming: no naming structure is imposed
      on CEs.
    • Decentralization: no central control.
    • Availability: allows a migrating CE to be
      located.

6
Some network protocols that implement consistent
hashing
  • OceanStore:
    http://oceanstore.cs.berkeley.edu/publications/papers/abstracts/silverback_sosp_tr.html
  • Chord: http://www.pdos.lcs.mit.edu/chord/
  • Grid: http://www.pdos.lcs.mit.edu/grid/

7
Consistent Hashing Claims
  • Claim: in a network with N nodes (N BioNet
    platforms) and K keys (Cyber Entities):
    • Every node is responsible for holding O( K / N )
      keys.
    • Every CE lookup is resolved via O( log N )
      messages.
    • Each node maintains routing information for only
      O( log N ) other nodes.

8
Consistent Hashing Claims (Continued)
  • Claims (continued):
    • When a node joins or leaves the network, the
      routing information between nodes is updated
      with O( log² N ) messages.
    • Correct lookup of a CE is guaranteed as long as
      each node holds one piece of correct routing
      information.
  • OK, but how does it work?

9
Consistent Hashing
  • Basic idea: given a key, map the key onto a node.
  • All nodes and keys are assigned an m-bit
    identifier using a base hash function (such as
    SHA-1).
  • A node's identifier (which must be unique) is
    chosen by hashing the node's IP address.
  • A key's identifier is produced by hashing the key.
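
The identifier assignment above can be sketched in a few lines of Python. This is only an illustration: the slides say "a base hash function (like SHA-1)" but do not say how the 160-bit digest is reduced to m bits, so keeping the top m bits is an assumption here, and the function name, IP address, and CE name are hypothetical.

```python
import hashlib

def identifier(value: str, m: int = 160) -> int:
    """Map a string to an m-bit identifier using SHA-1 (1 <= m <= 160)."""
    if not 1 <= m <= 160:
        raise ValueError("m must be between 1 and 160 for SHA-1")
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    # Assumption: keep the top m bits of the 160-bit digest.
    return int.from_bytes(digest, "big") >> (160 - m)

node_id = identifier("192.0.2.10", m=16)       # hypothetical platform IP address
key_id = identifier("cyber-entity-42", m=16)   # hypothetical Cyber Entity name
```

Nodes and keys share one identifier space, which is what lets a key be assigned to a node by comparing identifiers.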

10
Assigning Keys to Nodes
  • Identifiers are ordered on a circular number line
    from 0 to 2^m - 1.
  • Key k is assigned to the first node whose
    identifier is equal to or follows k's identifier
    on the number line.

11
Example of number line
m = 3. There are three nodes: 0, 1, and 3. There
are three keys: 1, 2, and 6. Key 1 is located at
node 1, key 2 is located at node 3, and key 6 at
node 0.
[Figure: identifier circle with positions 0 to 7; nodes at 0, 1, and 3; keys 1, 2, and 6.]
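
The assignment rule and the m = 3 example can be checked with a short Python sketch. The function name `successor` and the use of a sorted list with binary search are illustrative choices, not something the slides prescribe.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Return the first node identifier equal to or following key_id on the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    # Wrap past the largest identifier back to the smallest one.
    return ring[i % len(ring)]

nodes = [0, 1, 3]
assignment = {k: successor(nodes, k) for k in (1, 2, 6)}
print(assignment)  # {1: 1, 2: 3, 6: 0}
```

Key 6 wraps around the circle: no node has an identifier of 6 or 7, so the first node that follows it is node 0.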
12
Scalable Key Location
  • Only a small amount of routing information needs
    to be held at each node for correctness.
  • If each node keeps a pointer to the next node on
    the number line, then the network is fully
    connected: every node can reach every other node
    (although slowly).
  • To speed things up, every node holds routing
    data for a small number of other nodes.
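
A minimal sketch of the slow but correct approach, in which each node knows only the next node on the number line. The names `walk`, `next_node`, and `in_half_open` are hypothetical, and the ring is the three-node example with nodes 0, 1, and 3.

```python
def in_half_open(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    # The interval wraps past zero (or covers the whole circle when a == b).
    return x > a or x <= b

def walk(next_node, start, key_id):
    """Follow next-node pointers from start; return (responsible node, hops)."""
    n, hops = start, 0
    while not in_half_open(key_id, n, next_node[n]):
        n = next_node[n]
        hops += 1
    return next_node[n], hops

next_node = {0: 1, 1: 3, 3: 0}   # each node points to the next node on the circle
print(walk(next_node, 1, 6))     # (0, 1)
```

In the worst case the query visits every node, so the hop count grows linearly with the number of nodes. That is the cost the extra routing data is meant to avoid.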

13
Node Routing table
  • Each node n holds a table with at most m entries
    (m is the bit length of the identifier hash).
    This is because there can be at most 2^m unique
    nodes.
  • The ith entry contains the identity of the first
    node, s, that succeeds n by at least 2^(i-1):
    s = successor( (n + 2^(i-1)) mod 2^m ),
    1 ≤ i ≤ m.
  • A table entry contains both the node identifier
    number and the IP address of the node.
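
The table definition can be sketched as follows. The helper names are illustrative, and the sketch stores only identifiers; a real table entry would also carry the node's IP address, as the last bullet notes.

```python
from bisect import bisect_left

def successor(ring, ident):
    """First node identifier equal to or following ident on the sorted ring."""
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]

def finger_table(n, node_ids, m):
    """Return (start, successor) pairs for node n, one per entry i = 1..m."""
    ring = sorted(node_ids)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % 2 ** m
        table.append((start, successor(ring, start)))
    return table

print(finger_table(1, [0, 1, 3], m=3))  # [(2, 3), (3, 3), (5, 0)]
```

Because the starts are spaced at powers of two, the entries are dense just after n and sparse farther around the circle.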

14
Characteristics of Node Table
  • Each node stores information about only a small
    number of other nodes.
  • A node has more information about the nodes
    closely following it on the identifier circle
    than about nodes farther away.
  • A node table generally does not contain enough
    information to determine the successor of an
    arbitrary key.
  • A node can forward the request to another node in
    its table that has more information about the
    key's identifier.

15
Searching
  • Node n searches its table for the node j whose
    identifier most closely precedes key k's
    identifier.
  • n passes the query on to j.
  • The process is repeated until the node
    responsible for k's identifier is found.
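
The search steps can be sketched as a small simulation in Python. It follows the Chord-style rule of forwarding to the table entry that most closely precedes the key; all tables live in one process here, whereas each hop would be a network message in practice. The names `lookup`, `between`, and `FINGERS` are hypothetical, and the ring is the m = 3 example with nodes 0, 1, and 3.

```python
from bisect import bisect_left

M = 3
NODES = [0, 1, 3]  # must be sorted for the binary search below

def successor(ring, ident):
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]

def between(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # wraps past zero; a == b means the circle minus a

# Routing table per node: entry i is successor((n + 2^i) mod 2^M), i = 0..M-1.
FINGERS = {n: [successor(NODES, (n + 2 ** i) % 2 ** M) for i in range(M)]
           for n in NODES}

def lookup(start, key_id):
    """Return (node responsible for key_id, list of nodes visited)."""
    n, path = start, [start]
    # Stop once the key falls between n and n's immediate successor.
    while not between(key_id, n, FINGERS[n][0], inclusive_right=True):
        nxt = n
        for f in reversed(FINGERS[n]):   # try the farthest entry first
            if between(f, n, key_id):
                nxt = f
                break
        if nxt == n:                     # no closer node known; give up forwarding
            break
        n = nxt
        path.append(n)
    return FINGERS[n][0], path

print(lookup(1, 6))  # (0, [1, 3])
```

Starting at node 1, the query for key 6 is forwarded to node 3, which knows that its successor, node 0, is responsible for the key.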

16
Table for Node 1
Start   Interval   Successor
2       [2, 3)     3
3       [3, 5)     3
5       [5, 1)     0
[Figure: identifier circle with positions 0 to 7 and keys 1, 2, and 6.]
17
Big Picture
  • Each Cyber Entity (a key) is given an identifier
    number derived from keywords or a unique Cyber
    Entity ID.
  • Next, the Cyber Entity's location data is mapped
    to the proper BioNet platform (node).

18
Big Picture (continued)
  • This location data is the IP address of the
    BioNet platform on which the CE currently
    resides.
  • If the CE migrates, O( log N ) messages are sent
    to update the platform that holds the CE's
    location data.
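
A rough sketch of this idea, under several assumptions: the class `LocationDirectory`, its method names, the 16-bit identifier length, and the example IP addresses are all hypothetical, and the whole ring is held in one object for illustration. In a real deployment the register and locate calls would be routed between platforms rather than resolved locally.

```python
import hashlib
from bisect import bisect_left

M = 16  # assumed identifier length in bits

def identifier(value):
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (160 - M)

class LocationDirectory:
    """Maps each CE to the platform that stores the CE's current location."""

    def __init__(self, platform_ips):
        self.ring = sorted((identifier(ip), ip) for ip in platform_ips)
        self.store = {ip: {} for ip in platform_ips}  # per-platform location records

    def home_platform(self, ce_id):
        """Platform whose identifier is equal to or follows the CE's identifier."""
        ids = [node_id for node_id, _ in self.ring]
        i = bisect_left(ids, identifier(ce_id)) % len(self.ring)
        return self.ring[i][1]

    def register(self, ce_id, current_ip):
        """Record (or update after a migration) where the CE currently resides."""
        self.store[self.home_platform(ce_id)][ce_id] = current_ip

    def locate(self, ce_id):
        return self.store[self.home_platform(ce_id)].get(ce_id)

directory = LocationDirectory(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
directory.register("ce-7", "192.0.2.1")   # CE starts on one platform
directory.register("ce-7", "192.0.2.3")   # CE migrates; only its record changes
print(directory.locate("ce-7"))           # 192.0.2.3
```

The platform that stores the record depends only on the CE's identifier, so a migration updates one record and does not move it.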

19
Other Aspects of Consistent Hashing
  • More complicated to explain (and to prepare
    slides for) is node joining and rejoining.
  • Node failure, etc.

20
Future Work
  • Analyze the network traffic generated when many
    CEs are migrating.
  • Analyze what happens when CE location data is
    lost in a node failure.

21
[Figure: identifier circle with positions 0 to 7.]