Title: Updates in Highly Unreliable, Replicated Peer-to-Peer Systems


1
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems
  • Anwitaman Datta, Manfred Hauswirth, Karl Aberer
  • (EPFL)
  • Presented by Zhiyuan Troy Zhan

2
Outline
  • Motivation
  • System Model and Algorithm
  • Analytical Model and Analysis
  • Related Work
  • Conclusion

3
Motivation
  • Peer-to-peer systems are not just about file
    sharing.
  • Data items can be added, deleted and updated
    frequently, for example in peer commerce, shared
    calendars/address books, trust management and
    medical information sharing.
  • Replication is used to improve fault tolerance
    and response time.

4
Motivation, contd
  • Target problem: how to disseminate updates to
    the other peers.
  • Concerns: consistency guarantees, scalability,
    underlying system assumptions, resource
    consumption.
  • Challenges: a huge number of peers; peers can go
    online/offline at any time; global knowledge is
    often lacking.

5
Motivation, contd
  • Contributions:
  • Address the update dissemination problem under
    low online probabilities of peers (< 30%) and
    without global knowledge.
  • Present a fully decentralized, efficient and
    robust communication scheme based on rumor
    spreading.
  • Provide a generic analytical model of the
    combined push/pull technique.

6
Motivation - Problem Statement
  • Assumptions:
  • Only a low percentage of peers is online, so it
    is impossible to achieve any kind of quorum.
  • Transactional consistency is not required;
    eventual consistency is desirable in most
    applications.
  • Update conflicts are very rare; the paper does
    not handle them.
  • Probabilistic guarantees of successful search are
    sufficient.
  • The total number of replicas is substantially
    lower than the total number of peers (e.g. 1,000
    vs. 1,000,000).
  • Consecutive updates can be distributed sparsely
    over time.
  • Communication overhead is the major performance
    measure.

7
Outline
  • Motivation
  • System Model and Algorithm
  • Analytical Model and Analysis
  • Related Work
  • Conclusion

8
System Model
  • A peer-to-peer overlay network.
  • Each peer has its own local knowledge, i.e.
    routing table, replica list, etc.
  • Peers can go offline at any time.
  • A communication channel can be established
    between any two online peers; if it cannot, each
    peer assumes the other is offline.

9
Algorithm: Push Phase
  • Executed when disseminating updates.
  • At replica $p$, upon receiving message
    Push($U$, $V$, $R_f$, $t$):
  • IF Push($U$, $V$, $R_f$, $t$) has not been
    processed THEN
  • Select a random subset $R_p$ of replicas with
    $|R_p| = R \cdot f_r$
  • With probability $P_F(t)$, send
    Push($U$, $V$, $R_f \cup R_p \cup \{p\}$, $t+1$)
    to $R_p \setminus R_f$
  • Mark Push($U$, $V$, $R_f$, $t$) as processed

10
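Algorithm: Push Phase, contd
  • $U$: the actual updated data item.
  • $V$: version vector. Contains global version
    identifiers (GUIDs, which can be computed
    locally); altered data items are treated as
    distinct, coexisting versions.
  • $R$, $R_f$, $R_p$: replicas ($R_f$: the list
    carried in the push message; $R_p$: the random
    subset selected at $p$).
  • $P_F(t)$: forwarding probability, a function of
    the round $t$.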
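
The push step on the previous slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `handle_push`, the message tuple layout, and the choice to identify an already processed update by its version `V` alone are assumptions made for the example.

```python
import random

def handle_push(p, all_replicas, update, version, r_f, t, f_r, p_f,
                processed, rng=random):
    """Process Push(U, V, R_f, t) at replica p and return the messages to send.

    Returns a list of (target, message) pairs, where message is the tuple
    (update, version, new_r_f, t + 1).
    """
    if version in processed:        # assumption: an update is identified by V
        return []
    processed.add(version)          # mark the push as processed

    # Select a random subset R_p with |R_p| = R * f_r.
    k = round(len(all_replicas) * f_r)
    r_p = set(rng.sample(sorted(all_replicas), k))

    # Forward only with probability P_F(t).
    if rng.random() >= p_f(t):
        return []

    # The forwarded list grows to R_f ∪ R_p ∪ {p}; targets are R_p \ R_f.
    new_r_f = frozenset(r_f) | r_p | {p}
    message = (update, version, new_r_f, t + 1)
    return [(target, message) for target in sorted(r_p - set(r_f))]
```

Passing a constant function such as `lambda t: 1.0` for `p_f` makes every replica forward; a decreasing function of `t` damps the flood in later rounds.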

11
Algorithm: Pull Phase
  • Executed when a peer recovers from a failure,
    reconnects, receives no updates for a while, or
    receives a pull message but is not sure whether
    it is itself in sync.
  • Contact online replicas.
  • Inquire for missed updates based on version
    vectors.
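
A pull can be pictured as a comparison of version identifiers followed by a fetch of whatever is missing. The sketch below is a simplification under stated assumptions: each replica's state is a plain dict from version identifier to update (a stand-in for the version vector), and the name `pull` is hypothetical.

```python
def pull(local_store, online_replicas):
    """Merge updates missing from local_store; return the ids that were added.

    local_store and each entry of online_replicas map a version identifier
    to its update (a simplified stand-in for the version vector).
    """
    added = []
    for remote_store in online_replicas:
        # Versions the remote replica knows but the local peer has missed.
        missing = remote_store.keys() - local_store.keys()
        for version_id in sorted(missing):
            local_store[version_id] = remote_store[version_id]
            added.append(version_id)
    return added
```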

12
Outline
  • Motivation
  • System Model and Algorithm
  • Analytical Model and Analysis
  • Related Work
  • Conclusion

13
General Assumptions
  • Assume an update U is initiated for R online
    replicas.
  • In general, the online population in push round
    $t$ is
  • $R_{on}(t) = R_{on}(t-1)\,x + (R - R_{on}(t-1))\,y$
  • $x = 1 - p$, $y = q$
  • $p$: probability of an online peer going offline
    in one push round; $q$: probability of an offline
    peer coming online in one push round.
  • $p$, $q$ are typically small and may vary between
    rounds.
  • ASSUME $p$ is constant and omit peers coming
    online:
  • $R_{on}(t) = R_{on}(t-1)\,x$
  • ASSUME $f_r$ is constant.
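
The population recurrence is easy to iterate numerically. The following sketch assumes constant `p` and `q` and a hypothetical function name `online_population`; setting `q = 0` gives the simplified model used in the rest of the analysis.

```python
def online_population(r_total, r_on_0, p, q, rounds):
    """Return [R_on(0), ..., R_on(rounds)].

    Uses R_on(t) = R_on(t-1) * x + (R - R_on(t-1)) * y with x = 1 - p, y = q.
    """
    x, y = 1.0 - p, q
    series = [float(r_on_0)]
    for _ in range(rounds):
        prev = series[-1]
        series.append(prev * x + (r_total - prev) * y)
    return series
```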

14
Analysis of Push Phase: Round 0
  • Total number of messages:
    $\mathrm{msg}(0) = R f_r$
  • New replicas which receive the update:
    $\mathrm{newreplicas}(0) = R_{on}(0) f_r$
  • Online replicas that do not receive the update:
    $R_{on}(0)(1 - f_r)$
  • Message length ($U$ denotes the size of the
    update message, $B$ the size of the data required
    to describe one replica's metadata; only $U$ and
    $R_f$ are considered):
    $\mathrm{ML}(0) = U + R f_r B$

15
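Analysis of Push Phase: Round 1
  • Number of messages in round 1:
    $\mathrm{msg}(1) = R_{on}(0) f_r\, x\, P_F(1) \cdot R f_r (1 - f_r)$
  • Number of replicas newly pushed with the update
    after round 1:
    $\mathrm{newreplicas}(1) = R_{on}(0)\, x\, (1 - f_r) \left[1 - (1 - f_r)^{R_{on}(0) f_r x P_F(1)}\right]$
  • Length of message:
    $\mathrm{ML}(1) = U + R B \left(f_r + f_r(1 - f_r)\right) = U + R B \left(1 - (1 - f_r)^2\right)$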
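
To get a feel for the magnitudes, the expressions for the first two push rounds can be evaluated directly. The sketch below only transcribes those expressions; the function name `first_two_rounds` and the dictionary keys are made up for illustration.

```python
def first_two_rounds(r, r_on0, f_r, x, pf1, u, b):
    """Evaluate the round-0 and round-1 expressions of the push analysis."""
    round0 = {
        "msg": r * f_r,                    # messages sent by the initiator
        "new_replicas": r_on0 * f_r,       # online replicas reached
        "msg_len": u + r * f_r * b,        # update plus list of R * f_r replicas
    }
    # Round-0 receivers that stay online and decide to forward.
    pushers = r_on0 * f_r * x * pf1
    round1 = {
        "msg": pushers * r * f_r * (1 - f_r),
        "new_replicas": r_on0 * x * (1 - f_r) * (1 - (1 - f_r) ** pushers),
        "msg_len": u + r * b * (1 - (1 - f_r) ** 2),
    }
    return round0, round1
```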

16
Analysis of Push Phase: Round $t \ge 2$
  • Define $f_{d\_aware}(t)$ and $f_{aware}(t)$:
  • $f_{d\_aware}(t)$: increment in the fraction of
    online replicas which are aware of the update
    after round $t$
  • $f_{aware}(t)$: total fraction of online replicas
    which are aware of the update at the beginning of
    round $t$
  • $f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1)$

17
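Analysis of Push Phase: Round $t \ge 2$, contd
  • $\mathrm{newreplicas}(t) = R_{on}(t-1)\,(1 - f_{aware}(t-1))\, x \left[1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}\right]$
    (in paper)
  • $\mathrm{newreplicas}(t) = R_{on}(t-1)\,(1 - f_{aware}(t))\, x \left[1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}\right]$
    (I think)
  • Given
    $f_{d\_aware}(t) = (1 - f_{aware}(t)) \left[1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}\right]$,
    we have
  • $f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1) = 1 - (1 - f_{aware}(t-1))\,(1 - f_r)^{R_{on}(t-2) f_{d\_aware}(t-2) x P_F(t-1)}$.
  • $f_{aware}(t)$ rapidly grows to 1.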
18
Analysis of Push Phase: Round $t \ge 2$, contd
  • If the partial list is ignored:
  • $\mathrm{msg}(t) = R_{on}(t-1)\, f_{d\_aware}(t-1)\, x\, P_F(t) \cdot R f_r$
  • If the partial list is considered:
  • $\mathrm{msg}(t) = R_{on}(t-1)\, f_{d\_aware}(t-1)\, x\, P_F(t) \cdot R f_r (1 - f_r)^t$  (1)
  • $\mathrm{ML}(t) = U + R B \left(1 - (1 - f_r)^{t+1}\right)$  (2)
  • Both (1) and (2) are proved in the paper by
    induction on $t$.
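
The two closed forms translate directly into code. This is a small sketch with hypothetical function names; the arguments `r_on_prev` and `f_d_prev` stand for R_on(t-1) and f_d_aware(t-1), and `pf_t` for P_F(t).

```python
def message_count(t, r, r_on_prev, f_d_prev, f_r, x, pf_t, partial_list=True):
    """Number of push messages in round t.

    Without the partial list each pusher sends R * f_r messages; with it,
    replicas already listed in the message are skipped, which scales the
    count by (1 - f_r) ** t.
    """
    count = r_on_prev * f_d_prev * x * pf_t * r * f_r
    if partial_list:
        count *= (1 - f_r) ** t
    return count

def message_length(t, u, r, b, f_r):
    """Message length in round t: U + R * B * (1 - (1 - f_r) ** (t + 1))."""
    return u + r * b * (1 - (1 - f_r) ** (t + 1))
```

The trade-off is visible in the two functions: the partial list reduces the number of messages per round but makes each message longer.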

19
Analysis of Pull Phase
  • Case 1: a replica $p$ comes online after a push
    phase is over.
  • Trivial: assume the other online replicas have
    already got the update.
  • Case 2: $p$ comes online during the push phase.
    Suppose a fraction $f_{aware}$ of the online
    replicas $R_{on}$ is aware of the update; the
    probability of $p$ getting the update in $m$
    attempts is
  • $1 - \left(1 - \frac{R_{on} f_{aware}}{R}\right)^m$  (3)
  • Query: similar to pull, but may need majority
    logic, a version scheme, or a hybrid of the two
    to identify the latest updates.
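
The pull success probability, and the number of attempts needed to reach a target probability, can be computed as below. The second function is not from the slides; it is obtained by solving expression (3) for m, and both function names are hypothetical.

```python
import math

def pull_success_probability(r, r_on, f_aware, m):
    """Probability of getting the update within m pull attempts."""
    hit = r_on * f_aware / r     # chance that one contacted replica has the update
    return 1.0 - (1.0 - hit) ** m

def attempts_needed(r, r_on, f_aware, target):
    """Smallest m whose success probability reaches target (0 < target < 1)."""
    hit = r_on * f_aware / r
    if hit <= 0.0:
        raise ValueError("no online replica is aware of the update")
    if hit >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - hit))
```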

20
Analytical Results
Varying initial online population $R_{on}(0)$, 1

21
Analytical Results contd
Varying initial online population $R_{on}(0)$, > 5

22
Analytical Results contd

23
Analytical Results contd

24
Analytical Results contd, Parameter tuning

25
Analytical Results contd, scalability

26
Discussions
  • Comparison with Gnutella
  • Parameter self-tuning (Optimization)

27
Outline
  • Motivation
  • System Model and Algorithm
  • Analytical Model and Analysis
  • Related Work
  • Conclusion

28
Related Work
  • Replication and updates in databases
  • iAnywhere Solutions: server-based approach
  • Bayou: assumes significantly fewer replicas, fewer
    updates, and short disconnections
  • Some other approaches assume the availability of
    resources and replicas in general

29
Related Work contd
  • Group communication and lazy epidemic schemes
  • Similar work: bimodal multicast, epidemic updates
  • None of them studies the special case of bimodal
    behavior and the utility of epidemic algorithms
    in a highly unreliable environment

30
Outline
  • Motivation
  • System Model and Algorithm
  • Analytical Model and Analysis
  • Related Work
  • Conclusion

31
Conclusion
  • This paper provides an analytical model to
    demonstrate the significant reduction of message
    overhead using combined push and pull
    techniques.
  • The solution is fully decentralized; no global
    knowledge is needed.
  • The paper is available on CiteSeer and will
    appear in ICDCS 2003.