Epidemic Protocols - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Epidemic Protocols
  • CS614
  • March 7th 2002
  • Ashish Motivala

2
Papers
  • Epidemic algorithms for replicated database
    maintenance Alan Demers, Dan Greene, Carl
    Hauser, Wes Irish and John Larson Proceedings of
    the Sixth Annual ACM Symposium on Principles of
    Distributed Computing, 1987
  • Bimodal multicast Kenneth P. Birman, Mark
    Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu and
    Yaron Minsky ACM Trans. Comput. Syst. 17, 2
    (May. 1999)
  • Managing update conflicts in Bayou, a weakly
    connected replicated storage system D. B. Terry,
    M. M. Theimer, Karin Petersen, A. J. Demers, M.
    J. Spreitzer and C. H. Hauser, SOSP 1995
  • Flexible update propagation for weakly consistent
    replication Karin Petersen, Mike J. Spreitzer,
    Douglas B. Terry, Marvin M. Theimer and Alan J.
    Demers, SOSP 1997
  • Fighting fire with fire using randomized gossip
    to combat stochastic scalability limits Indranil
    Gupta, Kenneth P. Birman, Robbert van Renesse To
    appear, March, 2002
  • Dangers of Replication and a Solution Jim Gray,
    Pat Helland, Patrick O'Neil, Dennis Shasha, SIGMOD
    1996 (Read in CS632)

3
Simple Epidemic
  • Assume a fixed population of size n
  • For simplicity, assume homogeneous spreading
  • Simple epidemic: anyone can infect anyone with
    equal probability
  • Assume that k members are already infected
  • Infection occurs in rounds

4
Probability of Infection
  • Probability Pinfect(k,n) that a particular
    uninfected member is infected in a round if k are
    already infected?
  • Pinfect(k,n) = 1 - P(nobody infects member)
  •             = 1 - (1 - 1/n)^k
  • E(newly infected members) = (n - k) × Pinfect(k,n)
  • Basically it's a binomial distribution
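These formulas can be checked directly; a minimal sketch in Python (function names are mine, not from the paper):

```python
# One gossip round: each of the k infected members picks one target
# uniformly at random from the n members.
def p_infect(k: int, n: int) -> float:
    """Probability that a particular uninfected member is infected
    in a round in which k members are already infected."""
    return 1.0 - (1.0 - 1.0 / n) ** k

def expected_newly_infected(k: int, n: int) -> float:
    """Expected number of newly infected members in that round."""
    return (n - k) * p_infect(k, n)
```

For example, with n = 1000 and k = 100, p_infect is about 0.095, so roughly 86 of the 900 uninfected members are expected to become infected in that round.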

5
2 Phases
  • Intuition: 2 phases
  • Infection
  • Initial growth factor is very high, about 2
  • Exponential growth
  • Uninfection
  • Slow death of uninfection to start
  • Exponential decline
  • Number of rounds necessary to infect the entire
    population is O(log n)
  • First half: 1 → n/2 (Phase 1)
  • Second half: n/2 → n (Phase 2)
  • For large n, Pinfect(n/2, n) ≈ 1 - (1/e)^0.5 ≈ 0.4
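The two-phase behaviour can be seen by iterating the expected-value recurrence; a toy Python sketch (the starting point of one infected member and the 99.9% threshold are my own choices):

```python
def rounds_to_infect(n: int, threshold: float = 0.999) -> int:
    """Iterate E[k] round by round, starting from one infected member,
    until at least threshold * n members are infected."""
    k, rounds = 1.0, 0
    while k < threshold * n:
        # Expected newly infected this round: (n - k) * Pinfect(k, n)
        k += (n - k) * (1.0 - (1.0 - 1.0 / n) ** k)
        rounds += 1
    return rounds
```

k roughly doubles each round while k << n (phase 1), and the uninfected remainder then shrinks by roughly a factor of e each round (phase 2), so the total round count grows only logarithmically in n.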

6
Applications for Epidemic Protocols
  • Reliable multicast: virtual synchrony, randomized
    rumour spreading.
  • Systems (database replication): Clearinghouse,
    Grapevine, Bayou
  • Membership and failure detection: SWIM, SCAMP
  • Data Aggregation
  • Other distributed protocols: leader election,
    lightweight probabilistic broadcast (delta
    reliability), Li Li's work, Kempe and
    Kleinberg's work
  • Our focus today

7
Grapevine and Clearinghouse
  • Weakly consistent replication was used at Xerox
    PARC
  • Grapevine and Clearinghouse name services
  • Updates are propagated by unreliable multicast
    (direct mail).
  • Periodic anti-entropy exchanges among replicas
    ensure that they eventually converge, even if
    updates are lost.
  • Arbitrary pairs of replicas periodically
    establish contact and resolve all differences
    between their databases.
  • Various mechanisms (e.g., MD5 digests and update
    logs) reduce the volume of data exchanged in the
    common case.
  • Deletions handled as a special case via death
    certificates recording the delete operation as
    an update.

8
Epidemic Algorithm: Rumour Mongering
  • Each replica periodically touches a selected
    susceptible peer site and infects it with
    updates.
  • Transfer every update known to the carrier but
    not the victim in pull and vice versa in push.
    Rumours are dropped using counter or coins
    schemes.
  • Partner selection is randomized using a variety
    of heuristics. Distance vs. convergence tradeoff:
    if only neighbours are updated then link traffic
    is O(1) but convergence traffic is O(n).
  • Sites connect to others at distance d with
    probability proportional to d^(-a)
  • Theory shows that the epidemic will eventually
    infect the entire population (assuming the
    network is connected).
  • Heuristics (push vs. pull) affect traffic load
    and the expected time-to-convergence. Pull
    converges faster than push.
  • Pull: pi+1 = (pi)^2
  • Push: pi+1 = pi/e, where pi = prob. of a site
    being susceptible after i rounds (cycles)
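Iterating these two recurrences shows the difference in convergence speed; a small Python sketch (the starting probability 0.5 and the 10^-9 tolerance are arbitrary choices of mine):

```python
import math

def rounds_until(p0: float, eps: float, step) -> int:
    """Count rounds until the susceptible probability drops below eps."""
    p, rounds = p0, 0
    while p > eps:
        p = step(p)
        rounds += 1
    return rounds

# Pull: a site stays susceptible only if the partner it pulls from is
# also susceptible, so p_{i+1} = p_i^2 (quadratic convergence).
pull_rounds = rounds_until(0.5, 1e-9, lambda p: p * p)

# Push: p_{i+1} = p_i / e (approximately, once most sites are infected).
push_rounds = rounds_until(0.5, 1e-9, lambda p: p / math.e)
```

Starting from half the population susceptible, pull drives p below 10^-9 in 5 rounds while push needs about 21, which is why pull converges faster.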

9
Recap.
  • Two Reliable Multicast Models
  • SRM
  • Local repair of problems but no end-to-end
    guarantees
  • Virtual synchrony model (Isis, Horus, Ensemble)
  • All or nothing message delivery with ordering
  • Membership managed on behalf of group
  • State transfer to joining member
  • Great performance for small systems. In large
    groups, under perturbations (heavy load,
    applications acting a little flaky), performance
    is very hard to maintain.

10
Multicast scaling issue (SRM)
11
Multicast scaling issue (Ensemble)
12
Bimodal Multicast
  • 2 Sub-protocols
  • Unreliable data distribution (IP multicast)
  • Upon arrival, a message enters the receiver's
    message buffer.
  • Messages are delivered to the application layer
    in FIFO order, and are garbage collected out of
    the message buffer after some period of time.
  • The second sub-protocol is used to repair gaps in
    the message delivery record
  • Processes maintain a list of a random subset of
    the full system membership. In practice, this
    list is weighted to contain primarily nearby
    processes accessible over low-latency links.

13
Start by using unreliable multicast to rapidly
distribute the message. But some messages may not
get through, and some processes may be faulty.
So initial state involves partial distribution of
multicast(s)
14
Periodically (e.g. every 100ms) each process
sends a digest describing its state to some
randomly selected group member. The digest
identifies messages. It doesn't include them.
15
Recipient checks the gossip digest against its
own history and solicits a copy of any missing
message from the process that sent the gossip
16
Processes respond to solicitations received
during a round of gossip by retransmitting the
requested message. The round lasts much longer
than a typical RPC time.
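The digest/solicit/retransmit cycle described on the last few slides can be sketched as follows (a toy Python model, not the paper's implementation; class and field names, group size, and loss rate are mine):

```python
import random

class Process:
    """Toy model of one group member's message buffer."""
    def __init__(self):
        self.buffer = {}                  # msg_id -> message body

    def digest(self):
        # The digest identifies messages; it doesn't include them.
        return set(self.buffer)

    def gossip_round(self, peer):
        # Compare the peer's digest against our own history and
        # solicit a copy of every message we are missing; the peer
        # responds by retransmitting the requested messages.
        for msg_id in peer.digest() - self.digest():
            self.buffer[msg_id] = peer.buffer[msg_id]

# Phase 1: unreliable multicast, with each receiver independently
# missing roughly 10% of the messages.
random.seed(0)
group = [Process() for _ in range(50)]
for msg_id in range(20):
    for p in group:
        if random.random() < 0.9:
            p.buffer[msg_id] = "payload-%d" % msg_id

# Phase 2: a few rounds of gossip with random partners repair the gaps.
for _ in range(6):
    for p in group:
        p.gossip_round(random.choice(group))
```

Each round roughly squares the probability that a given process still lacks a given message, so a handful of rounds is normally enough to repair the gaps.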
17
Optimizations
  • Request retransmission of the most recent
    multicast first
  • Idea is to catch up quickly leaving at most one
    gap in the retrieved sequence
  • Participants bound the amount of data they will
    retransmit during any given round of gossip. If
    too much is solicited they ignore the excess
    requests

18
Optimizations
  • Label each gossip message with the sender's gossip
    round number
  • Ignore solicitations with an expired round
    number, reasoning that they arrived very late and
    hence are probably no longer correct
  • Don't retransmit the same message twice in a row to
    any given destination (the copy may still be in
    transit, hence the request may be redundant)

19
Optimizations
  • Use IP multicast when retransmitting a message if
    several processes lack a copy
  • For example, if solicited twice
  • Also, if a retransmission is received from far
    away
  • Tradeoff excess messages versus low latency
  • Use regional TTL to restrict multicast scope

20
Bimodal Multicast and SRM with system wide
constant noise, tree topology
(Figure: repair requests per second)
21
(No Transcript)
22
Two predicates
  • Predicate I: A faulty outcome is one where more
    than 10% but less than 90% of the processes get
    the multicast.
  • Predicate II: A faulty outcome is one where
    roughly half get the multicast and failures might
    conceal the true outcome

23
Bimodal Multicast is amenable to formal analysis
24
Unlimited scalability!
  • Probabilistic gossip routes around congestion
  • And probabilistic reliability model lets the
    system move on if a computer lags behind
  • Results in
  • Constant communication costs
  • Constant loads on links
  • Steady behavior even under stress

25
Good things?
  • Overcome Internet limitations using randomized
    P2P gossip
  • However, Internet routing can defeat our clever
    solutions unless we know network topology
  • Both have great scalability and can survive under
    stress
  • And both are backed by formal models as well as
    real code and experimental data

26
Further Work
27
Research Locations
  • Cornell Spinglass
  • http://www.cs.cornell.edu/Info/Projects/Spinglass/index.html
  • SWIM: http://www.cs.cornell.edu/gupta/swim
  • MSR Cambridge (Kermarrec):
    http://research.microsoft.com/camdis/gossip.htm
  • EPFL (Guerraoui):
    http://lpdwww.epfl.ch/publications

28
Bayou Basics
  • The motivation for Bayou comes from observations
    of mobile computing.
  • Connections are expensive, infrequent, and often
    intermittent.
  • Collaborating agents are unlikely to be guaranteed
    simultaneous connections.
  • Bayou accommodates these applications by helping
    them manage weakly consistent data. Bayou does
    not attempt to be transparent.

29
Bayou Basics (cont.)
  • Applications should use specific knowledge of
    their data, along with the knowledge that data
    may be stale, to detect and resolve conflicts.
  • Applications detect and resolve conflicts
    differently
  • Bayou allows for arbitrary dependencies,
    constraints, and detection of write/write and
    read/write conflicts.
  • Programs resolve conflicts with each write.
    Resolution may involve cascading back-outs.
  • Procedures must be deterministic so that they may
    be replayed on multiple machines.
  • A write is considered tentative until committed
    at the primary server.
  • A global ordering is used by the primary server
    to dictate which of several conflicting writes
    wins.
  • A modification is stable once it reaches the
    primary server.
  • Primary servers have authority, a tradeoff that
    allows data to become stable w/o hearing
    responses from all clients and servers.
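The tentative-versus-committed distinction above can be sketched as a toy Python model (my own naming, not Bayou's actual code):

```python
class Write:
    """A Bayou-style write: tentative until the primary commits it."""
    def __init__(self, server_id, local_seq, data):
        self.server_id = server_id      # originating server
        self.local_seq = local_seq      # origin-local timestamp
        self.data = data
        self.commit_seq = None          # None => still tentative

class Primary:
    """The primary server dictates the global commit order."""
    def __init__(self):
        self.next_seq = 0

    def commit(self, write):
        # First write to reach the primary gets the earlier position
        # in the global order, so it wins any conflict.
        write.commit_seq = self.next_seq
        self.next_seq += 1

def log_order(writes):
    """Committed writes first, in commit order; tentative writes after,
    in (origin, local timestamp) order."""
    committed = sorted((w for w in writes if w.commit_seq is not None),
                       key=lambda w: w.commit_seq)
    tentative = sorted((w for w in writes if w.commit_seq is None),
                       key=lambda w: (w.server_id, w.local_seq))
    return committed + tentative
```

Because only the primary assigns commit numbers, a write becomes stable as soon as it reaches the primary, without hearing responses from every client and server.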

30
Implementation
  • Two applications are studied, a bibliographic
    database and a meeting room scheduler.
  • Anti-entropy: a client may connect to any server
    for reading and writing data.
  • Servers replicate all data, and synchronize using
    pair-wise communication.
  • Anti-entropy ensures eventual consistency of the
    database (servers "gossip"). A primary server is
    the authoritative source of consistency.
  • Implementation: each server logs committed and
    tentative data. Anti-entropy sessions update
    these logs accordingly.
  • Access control and security: security is achieved
    with public key cryptography, access control by
    allowing users to grant and revoke privileges.
    Primary servers are responsible for managing
    revocation lists.
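A minimal sketch of one such pair-wise anti-entropy session (toy Python; it assumes each server's log is kept in timestamp order, and all names are mine):

```python
class Server:
    """Fully replicating server with a write log and a version vector."""
    def __init__(self, name):
        self.name = name
        self.log = []             # (origin, seq, data), timestamp order
        self.version_vector = {}  # origin -> highest seq seen

    def local_write(self, seq, data):
        self.log.append((self.name, seq, data))
        self.version_vector[self.name] = seq

    def anti_entropy_from(self, sender):
        # Pull only the writes the sender knows about that we have
        # not yet seen, as judged by our version vector.
        for origin, seq, data in sender.log:
            if seq > self.version_vector.get(origin, -1):
                self.log.append((origin, seq, data))
                self.version_vector[origin] = seq
        self.log.sort(key=lambda w: (w[1], w[0]))   # restore order
```

After a session in each direction the two servers hold identical logs, which is how repeated pair-wise sessions drive all replicas toward eventual consistency.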