About This Presentation

Title:

There and Back Again

Slides: 20

Provided by: sram4

Learn more at: http://www.aladdin.cs.cmu.edu

1
There and Back Again

or
How I Learned to Stop Worrying and Love the Chord
Protocol

By Sridhar Ramesh, working with Bob Harper, Matt
Kehrt, and Tom Murphy
2
The Chord Protocol, Recap

A method of assigning to different nodes on a
network responsibility for different keys and
their associated data
Nodes are arranged in positions on a ring, and
responsibilities are determined by these
positions
Nodes maintain pointers to their successors on
the ring, as well as to select other nodes
farther away on the ring
Stabilization procedures are used to maintain the
ring structure in various situations
Can be used as the low-level underpinnings for a
distributed hash table system
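
As a rough illustration of the recap above, here is a minimal sketch of how ring positions determine responsibility. The 8-bit identifier space, the node identifiers, and the function names are all assumptions made for this example; the power-of-two spacing of the far pointers follows the standard Chord design rather than anything stated on this slide.

```python
from bisect import bisect_left

M = 8          # identifier bits (assumed; kept small for illustration)
RING = 2 ** M  # size of the identifier space

def successor(node_ids, key):
    """Node responsible for `key`: the first node at or after it on the ring."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key % RING)
    return ids[i % len(ids)]  # wrap around past the highest identifier

def finger_table(node_ids, n):
    """Pointers from node n to the successors of n + 2^k, for k = 0..M-1."""
    return [successor(node_ids, (n + 2 ** k) % RING) for k in range(M)]

nodes = [10, 60, 130, 200]  # hypothetical node identifiers
print(successor(nodes, 61))     # 130
print(successor(nodes, 201))    # 10 (wraps around the ring)
print(finger_table(nodes, 10))  # [60, 60, 60, 60, 60, 60, 130, 200]
```

A real node only knows its own pointers, of course; the global sorted list here just stands in for the ring so the responsibilities are easy to see.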

3
Recap

The Chord protocol allows us to manage
distributed hash tables in a manner that is both
efficient and relatively robust
Lookups, with high probability, take time
logarithmic in the size of the network
The amount of work done is approximately evenly
distributed over all the nodes. No particular
node is responsible for excessively many keys,
should expect to see significantly more traffic
than the others, or is in any other way vital
to the system
This leads to the failure-tolerance of the Chord
protocol, as well as its ability to easily adapt
to the growth of a network. Nodes can join and
leave (whether intentionally or through failure)
with the necessary ring structure being
maintained and only minimal loss of data in the
case of node failures (higher-level structures
using the Chord protocol can be built with
redundancy to mitigate the otherwise unavoidable
damage of node failures).

4
Recap

However, the Chord protocol does not give us any
special way to merge two (or more) disjoint Chord
rings
We might want to do such a thing in the case of a
network partition: the automatic stabilization of
the Chord protocol would create two separate
rings during the partition, but, afterwards, we
would want to quickly merge them back into one

5
Recap

Potentially desirable features in such a merging
algorithm:
It should be able to run concurrently with
applications using the Chord network without too
much disruption. For example, it would not do to
fail on all lookups during the merging period, or
disallow joins during this time, etc.
Time efficiency: the less time to complete
merging, of course, the better

6
Previously, on Aladdin REU

I had proposed a scheme, using ghost nodes and
primary nodes, which would allow the members of
one Chord network to maintain their membership in
that ring while simultaneously moving to new
rings.
In this way, the nodes of a network can be
smoothly merged into another, without disrupting
uses of either network
Eventually, the no longer necessary ghost copies
of a node could be destroyed, and their ghost
networks with them, leaving only the one new
merged network

7
Ghosts in the Machine

The scheme was not yet fully fleshed out. It was
not quite resolved how to determine when a node
should merge into a new network
If just one node on network A decided to move to
network B, it could be arranged for all the nodes
on network A to eventually be informed of this
and move as well. Indeed, this could be done in
logarithmic time (in the size of A).
But how do we decide whether nodes should move
from A to B or B to A?

8
Size Does Matter

Naturally, we would prefer for smaller networks
to join into larger networks, rather than the
reverse, for merging in this manner would be more
efficient and take less time to complete (there
would be fewer ghost nodes to create and
eventually destroy, plus it would take less time
for the command to merge to be given to every
member of the smaller network)
So, expanding on this, the common-sense thing to
do would be to have all nodes in smaller networks
move into the largest network during the merging.
This would require having some way of determining
the size of a network, but it wouldn't be
terribly difficult to store or calculate
reasonable approximations of this size

9
Rounding Errors

If networks are sufficiently different in size, this
would work. Any reasonable estimates of their
sizes would result in the same conclusion as to
which was the larger, and thus all nodes would
agree to move away from the smaller network
However, if the two networks were close in size,
this would not work so well.
Different nodes giving different estimates of
sizes could result in different opinions as to
which network was the larger.

10
Round and Around Errors

Is it really all that problematic if nodes
disagree on which network is bigger?
Yes. Suppose some nodes from network A think
network B is larger. At the same time, some nodes
from network B think A is larger. The nodes from
A will move to B while the nodes from B will move
to A. After all this moving, we will still have
two disjoint networks. We will not have made any
progress towards merging!
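
The failure described above can be seen in a toy example. This is only a sketch: the node names, the even split of opinions, and the `migrate` helper are hypothetical, and real nodes would move one at a time rather than in one synchronized step.

```python
def migrate(a, b, thinks_other_is_larger):
    """Every node that believes the other network is larger moves to it."""
    movers_a = {n for n in a if thinks_other_is_larger(n)}
    movers_b = {n for n in b if thinks_other_is_larger(n)}
    new_a = (a - movers_a) | movers_b
    new_b = (b - movers_b) | movers_a
    return new_a, new_b

a = {"a1", "a2", "a3", "a4"}
b = {"b1", "b2", "b3", "b4"}

# Hypothetical noisy estimates: half of each network thinks the other is larger.
believers = {"a1", "a2", "b1", "b2"}

new_a, new_b = migrate(a, b, lambda n: n in believers)
print(sorted(new_a))  # ['a3', 'a4', 'b1', 'b2']
print(sorted(new_b))  # ['a1', 'a2', 'b3', 'b4']
```

The membership has been shuffled, but there are still two disjoint networks of four nodes each.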

11
Gimme a Break

What is needed is some form of symmetry-breaking,
to deterministically pick one particular network
out of a given pair of approximately
equally-sized networks.
No approach based on the topology of the networks
alone will do, since both networks could be
identical in this respect. Indeed, this shows us
the problem with using size alone to make this
choice in the first place.
The one area in which separate networks are
guaranteed to differ, though, is in the
identifiers (essentially the IP addresses) of
their nodes.
Aha! Of course, we can do the simplest thing
possible and give special significance to the
identifier of, say, the node responsible for key
0, for symmetry-breaking. Thus, in the case of
networks of approximately equal size, we could
compare these identifiers, instead of the size
estimates, in order to make our merging decision
12
A Break (not the kind we want)

Unfortunately
In order to determine, via the Chord protocol,
the node responsible for key 0 (and keep this
determination accurate in the face of possible
joins and leaves), messages are sent to that node
and the nodes near it
If every computer on a network is trying to
figure this out during a merge attempt, the node
responsible for key 0, and to a smaller extent
the nodes near it, will get far more than the
usual amount of network traffic.
Indeed, this massive increase in traffic could be
enough to actually knock this node offline. That
is problematic in itself, and also because the
reorganizing of the ring will subsequently result
in a new node being responsible for key 0, which
could result in a switch of which network should
merge into which
13
Hm

Well, that's not too good
After running repeatedly into problems like this,
I went, along with Tom, to talk to David
Anderson, who was more familiar with the Chord
protocol and who could give us some advice on
whether we were reinventing the wheel or such

14
Advice

David wasn't familiar offhand with any particular
merging work or particularly handy
symmetry-breaking scheme for the Chord protocol
So he kindly put us in touch with the designers
of the Chord protocol themselves, via phone!
But they weren't familiar with any such work
either.
In fact, when asked how they would deal with
merges, they said (paraphrasing): "Oh, nothing
special, just the natural ad hoc thing to do.
When we realize there's a partition, we let nodes
look up their successor on the other network, and
then adjust their successor pointer if necessary.
Then, after a while, normal Chord stabilization
takes care of everything."
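
The successor-adjustment step described in that paraphrase can be sketched as follows. This is a simplified model, not the Chord designers' code: the 8-bit ring, the node identifiers, and every function name are assumptions, lookups are taken to be exact, and the later stabilization (predecessors, far pointers) is not modelled.

```python
RING = 2 ** 8  # size of the identifier space (assumed for illustration)

def clockwise(a, b):
    """Clockwise distance from a to b on the ring (a full turn if a == b)."""
    return (b - a - 1) % RING + 1

def ring_successors(ids):
    """Successor pointers for a correctly stabilized ring over `ids`."""
    ordered = sorted(ids)
    return {n: ordered[(i + 1) % len(ordered)] for i, n in enumerate(ordered)}

def lookup_successor(ids, n):
    """Stand-in for a Chord lookup: the first node in `ids` clockwise after n."""
    return min(ids, key=lambda m: clockwise(n, m))

def merge_step(succ, own, other):
    """Each node in `own` asks the other ring for its successor there and
    adopts it if it is closer than the node's current successor."""
    for n in own:
        candidate = lookup_successor(other, n)
        if clockwise(n, candidate) < clockwise(n, succ[n]):
            succ[n] = candidate

a = [10, 100, 200]  # hypothetical identifiers on ring A
b = [50, 150, 250]  # hypothetical identifiers on ring B
succ = {**ring_successors(a), **ring_successors(b)}

merge_step(succ, a, b)
merge_step(succ, b, a)
print(succ == ring_successors(a + b))  # True
```

With exact lookups, every node's closer candidate is precisely its successor in the combined ring, which is why one round is enough in this idealized setting.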

15
!!!!

It sounds so wrong.
It breaks up the ring structures piecemeal and
violates the Chord invariant (that once node A
can chase pointers to reach node B, it can always
do so)
As a result of this, network disruption is
potentially high during the merge, as data which
was once accessible from a particular network can
become completely lost to it until the merge
completes
There's no particularly strong theoretical
backing for it that we're aware of
But

16
????

It may just be good enough for us anyway
Network partitions are rare events, so we won't
be spending much time merging anyway
As a result, the disruption caused by this method
of merging may be tolerable in practice. If the
program using Chord is appropriately configured
to retry failed lookups after a wait if it
suspects a network partition is occurring, and
merging gets done in a reasonable amount of time,
the only problem at all will be a slight speed
hit
In fact, the way Chord is used in Conductor is
already suitable for this: rather than performing
lookups, callbacks are set up, where a node will
register interest in a key and then be contacted
once that key's data is available. Therefore,
Conductor, using this merging procedure, should
never have problems with failed lookups which
should succeed; it will only have some callbacks
take longer than usual in the rare event of a
network partition
The previous merging algorithm was getting
unwieldy and complex, especially to reason about;
simpler is, as always, better
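
The callback style mentioned above can be pictured with a small sketch. This is not Conductor's actual interface: the `CallbackStore` class and its `register` and `put` methods are hypothetical names, and everything runs in one process rather than across a network.

```python
from collections import defaultdict

class CallbackStore:
    """Toy key store where readers register interest instead of looking up."""

    def __init__(self):
        self.data = {}
        self.waiting = defaultdict(list)  # key -> callbacks not yet fired

    def register(self, key, callback):
        """Fire the callback now if the data exists, otherwise hold it."""
        if key in self.data:
            callback(key, self.data[key])
        else:
            self.waiting[key].append(callback)

    def put(self, key, value):
        """Store the data and contact every node that was waiting for it."""
        self.data[key] = value
        for callback in self.waiting.pop(key, []):
            callback(key, value)

store = CallbackStore()
seen = []
store.register("job-42", lambda k, v: seen.append((k, v)))
print(seen)  # [] (nothing has failed; the callback is just pending)
store.put("job-42", "result")
print(seen)  # [('job-42', 'result')]
```

A key that is temporarily unreachable during a merge looks the same as a key whose data has not arrived yet: the callback simply fires later.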

However, in order to make sure this is, indeed,
good enough for us, we have to do some testing
It can be hard to do useful tests on real
networks, as setting up large networks is a chore
and nondeterminism makes it difficult to
reproduce results
Working with Matt, I set up a simulated network
system, along with tests of this merging
procedure to run on it
Unfortunately, the tests gave the simulated
network its first strong workout and exposed a
few bugs. We've managed to get rid of most of
them, but one last one has been nagging us for
quite a while, preventing us from gathering much
hard data yet. Still, apart from the technical
hassle of debugging, we're very close to knowing
whether or not the simple merging procedure
suggested by the Chord designers themselves is,
in fact, good enough for us

18
The Future

While(Tests are buggy)
Debug tests
If(Tests show the simple merging to be good
enough)
Then celebrate and implement
Else maybe try to design yet another merging
algorithm
( One potentially intriguing idea, introduced by
Tom: when a node wants to switch networks, it
consults about 2n neighbors for advice, and they
either stay or move together, making random
choices if network sizes are close. As time goes
on, n increases, and thus the probability that
all the nodes have ended up on one network tends
to 1. Indeed, after a while, the network sizes
will be disparate with high probability anyway,
and at this point symmetry-breaking is no longer
a problem )