Updates in Highly Unreliable, Replicated Peer-to-Peer Systems

Learn more at: https://blough.ece.gatech.edu

1
Updates in Highly Unreliable, Replicated
Peer-to-Peer Systems

Anwitaman Datta, Manfred Hauswirth, Karl Aberer
(EPFL)
Presented by Zhiyuan Troy Zhan

2
Outline

Motivation
System Model and Algorithm
Analytical Model and Analysis
Related Work
Conclusion

3
Motivation

Peer-to-peer systems are not just about file sharing.
Data items can be added, deleted and updated frequently, for example in:
Peer commerce
Shared calendars/address books
Trust management
Medical information sharing
Replication is used to improve fault tolerance and response time.

4
Motivation, contd.

The target problem is how to disseminate updates to other peers.
Consistency guarantees
Scalability, underlying system assumptions, resource consumption
Challenges
Huge number of peers
Peers can go online/offline at any time
Global knowledge is often lacking

5
Motivation, contd.

Contributions
Address the update dissemination problem when peers have low online probabilities (< 30%) and no global knowledge.
Present a fully decentralized, efficient and robust communication scheme based on rumor spreading.
Provide a generic analytical model of the combined push/pull technique.

6
Motivation: Problem Statement

Assumptions
Only a low percentage of peers is online, so it is impossible to achieve any kind of quorum.
Transactional consistency is not required; eventual consistency is desirable in most applications.
Update conflicts are very rare; the paper does not handle them.
Probabilistic guarantees of successful search are sufficient.
The number of replicas is substantially lower than the total number of peers (e.g. 1,000 vs. 1,000,000).
Consecutive updates can be distributed sparsely over time.
Communication overhead is the major performance measure.

8
System Model

A peer-to-peer overlay network.
Each peer has only its own local knowledge, e.g. routing table, replica list, etc.
Peers can go offline at any time.
A communication channel can be established between any two online peers; if it cannot, each peer assumes the other is offline.

9
Algorithm: Push Phase

Executed when disseminating updates.
At replica $p$, upon receiving message Push($U$, $V$, $R_f$, $t$):
IF Push($U$, $V$, $R_f$, $t$) not processed THEN
    Select a random subset $R_p$ of replicas with $|R_p| = R \cdot f_r$
    With probability $P_F(t)$, send Push($U$, $V$, $R_f \cup R_p \cup \{p\}$, $t+1$) to $R_p \setminus R_f$
    Set Push($U$, $V$, $R_f$, $t$) as processed

10
Algorithm: Push Phase, contd.

$U$: the actual updated data item.
$V$: version vector. It contains global version identifiers (GUIDs, which can be computed locally); altered data items are treated as distinct, coexisting versions.
$R$, $R_f$, $R_p$: sets of replicas.
$P_F(t)$: a function of the round $t$.
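
The push step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Replica` class, its field names, the choice to key the "processed" set by version identifier alone, and the `p_forward` callable standing in for PF(t) are all hypothetical. Sending is modeled by returning the list of (target, message) pairs instead of using a network.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One replica p; `known_replicas` stands in for its local replica list R."""
    name: str
    known_replicas: list                              # names of known replicas
    f_r: float = 0.1                                  # fraction contacted per push
    processed: set = field(default_factory=set)      # version ids already handled
    store: dict = field(default_factory=dict)        # version id -> update data

    def on_push(self, update, version, r_f, t, p_forward, rng=random):
        """Handle Push(U, V, R_f, t) and return the (target, message) pairs to send."""
        if version in self.processed:
            return []                                 # duplicate push: ignore it
        self.processed.add(version)                   # mark as processed
        self.store[version] = update
        # Select a random subset R_p with |R_p| = R * f_r (at least one replica).
        k = min(len(self.known_replicas),
                max(1, round(len(self.known_replicas) * self.f_r)))
        r_p = set(rng.sample(self.known_replicas, k))
        # Forward only with probability P_F(t).
        if rng.random() >= p_forward(t):
            return []
        # Extend the partial list with R_p and this replica, and bump the round.
        new_list = frozenset(set(r_f) | r_p | {self.name})
        message = (update, version, new_list, t + 1)
        # Skip replicas that the partial list says were already contacted.
        return [(target, message) for target in sorted(r_p - set(r_f))]
```

Returning the outgoing messages keeps the sketch testable without a transport layer; a real peer would hand each pair to whatever channel its overlay provides.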

11
Algorithm: Pull Phase

Executed when a peer recovers from a failure, reconnects, receives no updates for a while, or receives a pull message but is not sure whether it is itself in sync.
Contact online replicas.
Inquire for missed updates based on version vectors.
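
A minimal Python sketch of the pull step is given below. It assumes, purely for illustration, that each replica's state is a dictionary from version identifier to update data, that `replicas` and `is_online` stand in for the network layer, and that a peer stops after `max_attempts` successful contacts; none of these names come from the paper.

```python
def missing_updates(local_versions, remote_store):
    """Return the remote updates whose version ids are unknown locally."""
    return {v: u for v, u in remote_store.items() if v not in local_versions}


def pull(local_store, replicas, is_online, max_attempts=3):
    """Contact up to `max_attempts` online replicas and merge missed updates.

    `replicas` maps a replica name to its store (version id -> update) and
    `is_online` is a predicate on replica names; both are hypothetical
    stand-ins for the overlay network.
    """
    contacted = 0
    for name, remote_store in replicas.items():
        if contacted >= max_attempts:
            break
        if not is_online(name):
            continue                      # no channel possible: treat as offline
        contacted += 1
        # Compare version ids and copy over whatever is missing locally.
        local_store.update(missing_updates(local_store.keys(), remote_store))
    return local_store
```

Versions are merged rather than overwritten, which matches the slide's statement that altered data items coexist as distinct versions.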

13
General Assumptions

Assume an update U is initiated for R online
replicas.
In general, the online population in push round $t$ is
$R_{on}(t) = R_{on}(t-1)\,x + (R - R_{on}(t-1))\,y$, with $x = 1 - p$ and $y = q$,
where $p$ is the probability of an online peer going offline in one push round and $q$ is the probability of an offline peer coming online in one push round.
$p$ and $q$ are typically small and may vary between rounds.
ASSUME $p$ is constant and omit peers coming online:
$R_{on}(t) = R_{on}(t-1)\,x$
ASSUME $f_r$ is constant.
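
The recurrence for the online population can be iterated directly. The function below is a small sketch with hypothetical parameter names; setting `q = 0` gives the simplified model in which peers coming online are omitted.

```python
def online_population(r_total, r_on_0, p, q, rounds):
    """Iterate R_on(t) = R_on(t-1)*x + (R - R_on(t-1))*y with x = 1 - p, y = q.

    Returns the list [R_on(0), R_on(1), ..., R_on(rounds)] as floats.
    """
    x, y = 1.0 - p, q
    r_on = [float(r_on_0)]
    for _ in range(rounds):
        prev = r_on[-1]
        # Online peers that stay online, plus offline peers that come online.
        r_on.append(prev * x + (r_total - prev) * y)
    return r_on
```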

14
Analysis of Push Phase: Round 0

Total number of messages: $msg(0) = R f_r$
New replicas which receive the update: $newreplicas(0) = R_{on}(0) f_r$
Online replicas that do not receive the update: $R_{on}(0)(1 - f_r)$
Message length: $ML(0) = U + R f_r B$, where $U$ is the size of the update message and $B$ is the size of the data required to describe one replica's metadata; only $U$ and $R_f$ are considered.

15
Analysis of Push Phase: Round 1

Number of messages in round 1: $msg(1) = R_{on}(0) f_r\, x\, P_F(1) \cdot R f_r (1 - f_r)$
Number of replicas newly reached by the push in round 1: $newreplicas(1) = R_{on}(0)\, x\, (1 - f_r)\,[1 - (1 - f_r)^{R_{on}(0) f_r x P_F(1)}]$
Message length: $ML(1) = U + R B\,(f_r + f_r(1 - f_r)) = U + R B\,(1 - (1 - f_r)^2)$
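
To make the round 0 and round 1 quantities concrete, the sketch below evaluates them numerically. The function and key names are hypothetical, and the formulas are the ones read off the two slides above: R replicas in total, R_on(0) of them online, fraction f_r contacted per push, x = 1 - p, and forwarding probability PF(1).

```python
def round0(r_total, r_on_0, f_r, u_size, b_size):
    """Round 0 quantities: messages, newly reached replicas, message length."""
    return {
        "msg": r_total * f_r,
        "new_replicas": r_on_0 * f_r,
        "unaware_online": r_on_0 * (1 - f_r),
        "msg_len": u_size + r_total * f_r * b_size,
    }


def round1(r_total, r_on_0, f_r, x, pf1, u_size, b_size):
    """Round 1 quantities under the same model."""
    # Replicas reached in round 0 that are still online and decide to forward.
    pushers = r_on_0 * f_r * x * pf1
    return {
        # Each pusher skips the fraction f_r already named in the partial list.
        "msg": pushers * r_total * f_r * (1 - f_r),
        # An unaware online replica is hit by at least one of the pushers.
        "new_replicas": r_on_0 * x * (1 - f_r) * (1 - (1 - f_r) ** pushers),
        "msg_len": u_size + r_total * b_size * (1 - (1 - f_r) ** 2),
    }
```

For example, `round0(1000, 200, 0.01, 100, 10)` describes an update of size 100 pushed among 1000 replicas, 200 of which are online.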

16
Analysis of Push Phase: Round $t \ge 2$

Define $f_{d\_aware}(t)$ and $f_{aware}(t)$:
$f_{d\_aware}(t)$: increment in the fraction of online replicas which are aware of the update after round $t$
$f_{aware}(t)$: total fraction of online replicas which are aware of the update at the beginning of round $t$
$f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1)$

17
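Analysis of Push Phase: Round $t \ge 2$, contd.

As given in the paper:
$newreplicas(t) = R_{on}(t-1)\,(1 - f_{aware}(t-1))\, x\, [1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}]$
As the presenter believes it should read:
$newreplicas(t) = R_{on}(t-1)\,(1 - f_{aware}(t))\, x\, [1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}]$
Given $f_{d\_aware}(t) = (1 - f_{aware}(t))\,[1 - (1 - f_r)^{R_{on}(t-1) f_{d\_aware}(t-1) x P_F(t)}]$, we have
$f_{aware}(t) = f_{aware}(t-1) + f_{d\_aware}(t-1) = 1 - (1 - f_{aware}(t-1))\,(1 - f_r)^{R_{on}(t-2) f_{d\_aware}(t-2) x P_F(t-1)}$.
$f_{aware}(t)$ rapidly grows to 1.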
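
The awareness recurrence can be iterated numerically to see how quickly it grows. The sketch below follows the presenter's reading of the formula, assumes the starting values f_aware(0) = 0 and fd_aware(0) = f_r (the fraction reached in round 0), and uses the simplified population model R_on(t) = R_on(t-1) x. The function name and the `p_forward` callable standing in for PF(t) are hypothetical.

```python
def awareness(r_on_0, f_r, x, p_forward, rounds):
    """Return [f_aware(0), f_aware(1), ..., f_aware(rounds)]."""
    f_aware, f_d = 0.0, f_r           # f_aware(0) = 0, fd_aware(0) = f_r
    r_on_prev = float(r_on_0)         # R_on(t-1), starting from R_on(0)
    history = [f_aware]
    for t in range(1, rounds + 1):
        # f_aware(t) = f_aware(t-1) + fd_aware(t-1)
        f_aware = f_aware + f_d
        # Number of replicas that push in round t (the exponent in the slide).
        pushers = r_on_prev * f_d * x * p_forward(t)
        # fd_aware(t) = (1 - f_aware(t)) * [1 - (1 - f_r)^pushers]
        f_d = (1 - f_aware) * (1 - (1 - f_r) ** pushers)
        r_on_prev *= x                # advance to R_on(t)
        history.append(f_aware)
    return history
```

How close the sequence gets to 1 depends on the parameters; with a very small f_r relative to the online population the spread can stall before every online replica is reached, which is where the pull phase helps.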

18
Analysis of Push Phase: Round $t \ge 2$, contd.

If the partial list is ignored:
$msg(t) = R_{on}(t-1)\, f_{d\_aware}(t-1)\, x\, P_F(t) \cdot R f_r$
If the partial list is considered:
$msg(t) = R_{on}(t-1)\, f_{d\_aware}(t-1)\, x\, P_F(t) \cdot R f_r (1 - f_r)^t$    (1)
$ML(t) = U + R B\,(1 - (1 - f_r)^{t+1})$    (2)
Both (1) and (2) are proved in the paper by induction on $t$.
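
As a numerical sanity check of (2), the sketch below compares the closed form with a round-by-round sum. The incremental version assumes that round i adds R f_r (1 - f_r)^i entries to the partial list, which is this sketch's reading of the round 0 and round 1 message lengths rather than a statement from the paper. A helper for msg(t) is included; all function names are hypothetical.

```python
def message_length(t, u_size, r_total, b_size, f_r):
    """Closed form (2): ML(t) = U + R*B*(1 - (1 - f_r)**(t + 1))."""
    return u_size + r_total * b_size * (1 - (1 - f_r) ** (t + 1))


def message_length_incremental(t, u_size, r_total, b_size, f_r):
    """Same quantity built up round by round (assumed growth of the list)."""
    entries = sum(r_total * f_r * (1 - f_r) ** i for i in range(t + 1))
    return u_size + entries * b_size


def messages(t, r_on_prev, f_d_prev, x, pf_t, r_total, f_r, partial_list=True):
    """msg(t); with partial_list=True this is formula (1)."""
    pushers = r_on_prev * f_d_prev * x * pf_t
    fanout = r_total * f_r * ((1 - f_r) ** t if partial_list else 1.0)
    return pushers * fanout
```

The two message-length functions agree because the per-round additions form a geometric series.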

19
Analysis of Pull Phase

Case 1: a replica $p$ comes online after the push phase is over.
Trivial: assume the other online replicas have already received the update.
Case 2: $p$ comes online during the push phase. Suppose a fraction $f_{aware}$ of the online replicas $R_{on}$ is aware of the update; the probability of $p$ getting the update in $m$ attempts is
$1 - [1 - (R_{on} f_{aware} / R)]^m$    (3)
Query: similar to pull, but may need majority logic, a version scheme, or a hybrid of the two to identify the latest updates.
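
Formula (3) is easy to evaluate, and inverting it gives the number of pull attempts needed for a target success probability. The sketch below uses hypothetical function names; the inversion is a direct rearrangement of (3) and is not taken from the paper.

```python
import math


def pull_success_probability(r_on, f_aware, r_total, m):
    """Formula (3): probability of obtaining the update within m attempts."""
    hit = r_on * f_aware / r_total    # one attempt reaches an online, aware replica
    return 1 - (1 - hit) ** m


def attempts_needed(r_on, f_aware, r_total, target):
    """Smallest m such that formula (3) is at least `target` (0 < target < 1)."""
    hit = r_on * f_aware / r_total
    if hit <= 0:
        raise ValueError("no online replica is aware of the update")
    if hit >= 1:
        return 1
    return math.ceil(math.log(1 - target) / math.log(1 - hit))
```

With 200 of 1000 replicas online and half of those aware, a single attempt succeeds with probability 0.1, so many attempts are needed for high confidence.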

20
Analytical Results

[Figure: varying initial online population $R_{on}(0)$, 1]

21
Analytical Results, contd.

[Figure: varying initial online population $R_{on}(0)$, > 5]

22
Analytical Results, contd.

23
Analytical Results, contd.

24
Analytical Results, contd.: Parameter tuning

25
Analytical Results, contd.: Scalability

Discussions

Comparison with Gnutella
Parameter self-tuning (Optimization)

28
Related Work

Replication and updates in databases
iAnywhere Solutions: server-based approach
Bayou: assumes significantly fewer replicas and fewer updates, and that disconnections are short
Some other approaches assume availability of resources and replicas in general

29
Related Work, contd.

Group communication and lazy epidemic schemes
Similar work: bimodal multicast, epidemic updates
None has made a dedicated case study of bimodal behavior and the utility of epidemic algorithms in a highly unreliable environment

31
Conclusion

This paper provides an analytical model that demonstrates the significant reduction in message overhead achieved by combining push and pull techniques.
The solution is totally decentralized; no global knowledge is needed.
The paper is available on CiteSeer and will appear in ICDCS 2003.
