p2p06 - PowerPoint PPT Presentation

Slides: 94
Provided by: ep58
1
Topics in Database Systems: Data Management in Peer-to-Peer Systems
PART 1: Replication and other issues
2
Agenda for today
  • 1. Description of the course projects
  • 2. General remarks on replication
  • 3. Replication Theory for Unstructured (Cohen et al paper)
  • 4. Epidemic Algorithms for Updates (Demers et al paper)
3
Term Projects
  • Projects of three types
  • They have some research character; you will need to think
  • There is no single solution (more than one group may take the same project)
  • I would like 3 people per group
  • If you have another idea, that is possible, but not automatically
  • You will build a web page for the project and send it to me
  • replicate content and not index (for durability)!!!

4
Term Projects
TYPE I PROJECTS
You will choose one paper from a list of papers. The papers concern data management problems either in centralized systems or in distributed systems that lack the properties of peer-to-peer systems. The goal of the project is to design a version of the problem that is suitable for a peer-to-peer system.
Your project must include some form of evaluation of your approach. This can be theoretical (e.g., an estimate of the complexity of the solution, or a proof of its correctness or of other properties such as load balancing) and/or include a small implementation.
You will submit a paper in the form of a research paper (instructions will be given). You will also present your project in class (instructions will be given).
5
Term Projects
  • Papers for Type I projects
  • 1-3: Choose any (one) of sections 3, 4 or 5 of M. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin and A. Yu. Mariposa: A Wide-Area Distributed Database System. VLDB J., 5(1), 1996, 48-63.
  • 4: Study how the following paper, which we discussed in class, can be adapted for p2p: A. J. Demers, D. H. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. E. Sturgis, D. C. Swinehart, D. B. Terry. Epidemic Algorithms for Replicated Database Maintenance. PODC 1987, 1-12.
  • 5: Consider a distributed (p2p) version of a bitmap index.
  • For bitmap indexes you may consult any database textbook and/or the following paper:
  • P. E. O'Neil and D. Quass. Improved Query Performance with Variant Indexes. Proc. SIGMOD Conference, 1997, 38-49.
  • 6: Study how the following paper, which concerns sensor networks, can be applied to p2p systems: D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava. The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks. DMSN Workshop, 2005.

6
Term Projects
TYPE II PROJECTS
You will choose a paper on topics in the area of peer-to-peer systems that we have not covered in class, specifically (i) security, (ii) trust/reputation, (iii) incentives, (iv) publish-subscribe systems.
Presentation of the paper in class.
Then either (a) propose some extension of the paper, e.g., applying it to another type of overlay, improving some of its characteristics, etc. In this case, you must also include some form of evaluation of the extension. This can be theoretical (e.g., an estimate of the complexity of the solution, etc.) and/or include a small implementation; or (b) implement a satisfactory part of the paper.
You will submit a paper in the form of a research paper (instructions will be given). You will also give a second presentation in class, this time of your own project (instructions will be given).
7
Term Projects
  • Papers for Type II projects
  • Security: E. Sit and R. Morris. Security Considerations for Peer-to-Peer Distributed Hash Tables. IPTPS 2002, 261-269. D. S. Wallach. A Survey of Peer-to-Peer Security Issues. ISSS 2002, 42-57.
  • Incentives: M. Feldman, K. Lai, I. Stoica and J. Chuang. Robust incentive techniques for peer-to-peer networks. ACM Conference on Electronic Commerce 2004, 102-111.
  • Trust/Reputation: S. D. Kamvar, M. T. Schlosser, H. Garcia-Molina. The Eigentrust algorithm for reputation management in P2P networks. WWW 2003, 640-651.
  • Publish/subscribe: M. Bender, S. Michel, S. Parkitny, and G. Weikum. A Comparative Study of Pub/Sub Methods in Structured P2P Networks. DBISP2P 2006, Seoul, South Korea, Springer, 2006.

8
Term Projects
TYPE III PROJECTS
You will choose one of the listed peer-to-peer software systems. You must install the software and build a small application with it.
You will submit a paper that includes a short manual for the system and a description of your application. You will also present your project in class (instructions will be given). The presentation must also include a short demo.
9
Term Projects
The systems for Type III projects
  • 1. OpenDHT: OpenDHT is a publicly accessible distributed hash table (DHT) service.
  • 2. P2 (Declarative Networking): P2 is a system which uses a high-level declarative language to express overlay networks in a highly compact and reusable form.
  • 3. PeerSim: PeerSim is a simulation environment for P2P protocols in Java.
10
Term Projects
Deadlines
  • Dec 7: Formation of groups and choice of project
  • Dec 14: 1-2 page "project proposal" (instructions will be given)
  • Dec 21: we will possibly hold a short presentation/discussion of the projects in the last week before Christmas
  • Jan 11: Presentations of the Group II papers
  • Jan 18: same as Jan 11
  • Jan 25: Project submission (for the paper, instructions will be given)
There will be a final workshop at which the projects of all groups will be presented.
11
Agenda for today
  • 1. Description of the course projects
  • 2. General remarks on replication
  • 3. Replication Theory for Unstructured (Cohen et al paper)
  • 4. Epidemic Algorithms for Updates (Demers et al paper)

12
Types of Replication
  • Two types of replication
  • Metadata/Index: replicate index entries
  • Data/Document replication: replicate the actual data (e.g., music files)
  • Metadata vs Data (pros and cons of replicating metadata)
  • (+) Lighter, both storage-wise and bandwidth-wise
  • (+) Sizes of replicated objects are more uniform
  • (-) Adds an extra hop for actually getting the data
  • (-) More frequent updates
  • (-) Less durability/availability

13
Types of Replication
Caching vs Replication
  • Cache: store data retrieved from a previous request (client-initiated)
  • Replication: more proactive; a copy of a data item may be stored at a node even if the node has not requested it
14
Reasons for Replication
  • Reasons for replication
  • Performance
  • load balancing
  • locality: place copies close to the requestor
  • geographic locality (more choices for the next
    step in search)
  • reduce number of hops
  • Availability
  • In case of failures
  • Peer departures

15
Reasons for Replication
Besides storage, there is another cost associated with replication: consistency maintenance.
Replication makes reads faster at the expense of slower writes.
16
  • No proactive replication (Gnutella)
  • Hosts store and serve only what they requested
  • A copy can be found only by probing a host with a
    copy
  • Proactive replication of keys (metadata plus pointer) for search efficiency (FastTrack, DHTs)
  • Proactive replication of copies for search
    and download efficiency, anonymity. (Freenet)

17
Issues
Which items (data/metadata) to replicate?
  • Based on popularity
  • In traditional distributed systems, also on the read/write rate (cost-benefit: the ratio read-savings/write-increase)
Where to replicate (allocation schema)?
18
Issues
How/when to update: both data items and metadata
19
Database-Flavored Replication Control Protocols
Let's assume the existence of a data item x with copies $x_1, x_2, \ldots, x_n$. Here x is the logical data item and the $x_i$ are the physical data items.
A replication control protocol is responsible for mapping each read/write on a logical data item (R(x)/W(x)) to a set of reads/writes on a (possibly proper) subset of the physical copies of x.
20
One Copy Serializability
Correctness: a DBMS for a replicated database should behave like a DBMS managing a one-copy (i.e., non-replicated) database, insofar as users can tell.
One-copy schedule: replace operations on data copies with operations on data items.
One-copy serializable (1SR): the schedule of transactions on a replicated database is equivalent to a serial execution of those transactions on a one-copy database.
21
ROWA
Read One/Write All (ROWA): a replication control protocol that maps each read to only one copy of the item and each write to a set of writes on all physical data item copies.
If even one of the copies is unavailable, an update transaction cannot terminate.
22
Write-All-Available
Write-all-available: a replication control protocol that maps each read to only one copy of the item and each write to a set of writes on all available physical data item copies.
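The two mappings can be contrasted with a small sketch. This is only an illustration under simplifying assumptions: the function names, the list of copy names, and the `available` set are hypothetical, and concurrency control and recovery are ignored.

```python
def rowa(op, copies, available):
    """Map a logical read/write on x to physical copies under ROWA.

    copies: names of all physical copies; available: set of reachable copies.
    Returns the copies touched, or None if the operation cannot proceed.
    """
    if op == "read":
        # Any single reachable copy can serve a read.
        for c in copies:
            if c in available:
                return [c]
        return None
    # A write must reach every copy, so one unreachable copy blocks it.
    return list(copies) if all(c in available for c in copies) else None


def write_all_available(op, copies, available):
    """Same read mapping, but writes go only to the reachable copies."""
    if op == "read":
        return rowa("read", copies, available)
    reachable = [c for c in copies if c in available]
    return reachable or None
```

With copies x1, x2, x3 and x3 unreachable, `rowa("write", ...)` returns None while `write_all_available("write", ...)` still writes x1 and x2, which is exactly the availability gain (and the consistency risk) of the second protocol.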
23
Quorum-Based Voting
  • A read quorum $V_r$ and a write quorum $V_w$ are needed to read or write a data item
  • If a given data item has a total of V votes, the quorums have to obey the following rules
  • Rule 1: $V_r + V_w > V$
  • Rule 2: $V_w > V/2$

Rule 1 ensures that a data item is not read and written by two transactions concurrently (R/W).
Rule 2 ensures that two write operations from two transactions cannot occur concurrently on the same data item (W/W).
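As a quick check of the two rules, the following sketch tests whether a choice of quorum sizes is admissible. The function name and parameter names are illustrative, not part of any particular system.

```python
def valid_quorums(v_read, v_write, v_total):
    """Return True if the quorum sizes satisfy both voting rules."""
    # Rule 1: every read quorum intersects every write quorum (R/W).
    rw_overlap = v_read + v_write > v_total
    # Rule 2: any two write quorums intersect (W/W).
    ww_overlap = 2 * v_write > v_total
    return rw_overlap and ww_overlap
```

For V = 5, the pair (Vr, Vw) = (2, 4) is admissible, and (1, 5) corresponds to ROWA; (2, 3) violates Rule 1 because a read quorum and a write quorum could be disjoint.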
24
Distributing Writes
Immediate writes
Deferred writes: access only one copy of the data item and delay the distribution of writes to other sites until the transaction has terminated and is ready to commit.
  • The transaction maintains an intention list of deferred updates
  • After the transaction terminates, it sends the appropriate portion of the intention list to each site that contains replicated copies
  • Optimization: aborts cost less
  • Drawbacks: may delay commitment; delays the detection of conflicts
Primary or master copy: updates at a single copy per item
25
Eager vs Lazy Replication
Eager replication: keeps all replicas synchronized by updating all replicas in a single transaction.
Lazy replication: asynchronously propagates replica updates to other nodes after the replicating transaction commits.
In p2p: lazy replication (or soft state).
26
Update Propagation
  • Stateless or stateful (the item owners know which nodes hold copies of the item)
  • Who initiates the update?
  • Push: by the server, i.e., the item (copy) that changes
  • Pull: by the client holding the copy

27
Update Propagation
  • When?
  • Periodic
  • Immediate
  • Lazy: when an inconsistency is detected
  • Threshold-based: on freshness (e.g., number of updates or actual time) or on value
  • Expiration time: items expire (become invalid) after that time (most often used in p2p)
  • Adaptive periodic: reduce or increase the period based on the updates seen between two successive updates
  • Stateless or stateful (the item owners know which nodes hold copies of the item)

28
Summary: Design parameters and performance (CAN)
Parameter | Path length | Neighbor state | Total path latency | Per-hop latency | Volume | Multiple routes | Replicas
Dimensions (d) | $O(d\,n^{1/d})$ | O(d) | ? | - | - | ? | -
Realities (r) | ? | O(r) | ? | - | O(r) | ? | O(r)
MAXPEERS (p) | O(1/p) | O(p) | ? | ? | O(p) | ? | O(p)
Hash functions (k) | - | - | ? | - | ?(k) | - | O(k)
RTT-weighted routing | - | - | ? | ? | - | - | -
Uniform partitioning heuristic | Reduced variance | Reduced variance | - | - | Reduced variance | - | -
Only on replicated data
29
CHORD Failures
  • Replication
  • Each node maintains a successor list of its r nearest successors
  • Upon failure, use the next successor in the list
  • Modify stabilize to fix the list

Other nodes may attempt to send requests through the failed node. Use alternate nodes found in the routing table of preceding nodes or in the successor list.
30
CHORD Failures
  • Theorem: If we use a successor list of length $r = \Omega(\log N)$ in an initially stable network and then every node fails with probability 1/2, then
  • with high probability, find_successor returns the closest living successor
  • the expected time to execute find_successor in the failed network is $O(\log N)$

A lookup fails if all r nodes in the successor list fail. With independent failures, all of them fail with probability $2^{-r}$, which equals $1/N$ for $r = \log_2 N$.
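The arithmetic behind the last remark can be sketched as follows, assuming independent failures with probability 1/2 and a successor list of length r = ceil(log2 N). Both helper names are illustrative.

```python
import math


def successor_list_length(n_nodes):
    """Smallest r such that 2**(-r) <= 1/n_nodes, i.e. r = ceil(log2 N)."""
    return math.ceil(math.log2(n_nodes))


def lookup_failure_probability(r, p_fail=0.5):
    """Probability that all r successors fail, assuming independent failures."""
    return p_fail ** r
```

For N = 1024 nodes this gives r = 10 and a failure probability of exactly 1/1024.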
31
CHORD Replication
Store replicas of a key at the k nodes succeeding the key.
The successor list helps to keep the number of replicas per item known.
Another approach: store a copy per region.
32
BATON Failures
There is routing redundancy
  • Upon node departure or failure, the parent can
    reconstruct the entries
  • Assume node x fails; any detected failures of x are reported to its parent y
  • y regenerates the routing tables of x (Theorem 2)
  • Messages are routed
  • Sideways (redundancy similar to CHORD)
  • Up-down (can find its parent through its
    neighbors)

33
Replication - Beehive
  • Proactive model-driven replication
  • Passive (demand-driven) replication such as
    caching objects along a lookup path
  • Hint for BATON
  • Beehive
  • The length of the average query path is reduced by one when an object is proactively replicated at all nodes logically preceding that node on all query paths
  • BATON
  • Range queries
  • Many paths to data

Any ideas?
34
Agenda for today
  • 1. Description of the course projects
  • 2. General remarks on replication
  • 3. Replication Theory for Unstructured (Cohen et al paper)
  • 4. Epidemic Algorithms for Updates (Demers et al paper)
35
Replication Theory: Replica Allocation Policies in Unstructured P2P Systems
E. Cohen and S. Shenker, Replication Strategies in Unstructured Peer-to-Peer Networks. SIGCOMM 2002
Q. Lv et al, Search and Replication in Unstructured Peer-to-Peer Networks, ICS'02 (replication part)
36
Replication Allocation Scheme
Question: how can replication be used to improve search efficiency in unstructured networks?
How many copies of each object should be kept so that the search overhead for the object is minimized, assuming that the total amount of storage for objects in the network is fixed?
37
Replication Theory - Model
Assume m objects and n nodes
Each node has capacity $\rho$; the total capacity is $R = n\rho$
How to allocate R among the m objects?
Determine $r_i$: the number of copies of object i (distinct nodes that hold a copy of i)
$\sum_{i=1}^{m} r_i = R$ (R: total capacity)
Also, $p_i = r_i/R$: the fraction of total capacity allocated to i
The allocation is represented by the vector $(p_1, p_2, \ldots, p_m) = (r_1/R, r_2/R, \ldots, r_m/R)$
38
Replication Theory - Model
Assume that object i is requested with relative rate $q_i$; we normalize by setting $\sum_{i=1}^{m} q_i = 1$
For convenience, assume $1 \ll r_i \le n$ and that $q_1 \ge q_2 \ge \cdots \ge q_m$
Map the query distribution q to an allocation vector p
39
Replication Theory - Model
Assume all nodes have equal capacity $\rho$, $\rho = R/n$
$R \ge m$ (at least one copy per item)
$m > \rho$ (else the problem is trivial: maintain copies of all items everywhere)
Bounds for $p_i$:
At least one copy, $r_i \ge 1$: lower value $l = 1/R$
At most n copies, $r_i \le n$: upper value $u = n/R$
40
Replication Theory
Assume that searches go on until a copy is found
We want to determine the $r_i$ that minimize the average search size (number of nodes probed) to locate an item i
We need to compute the average search size per item
Searches consist of randomly probing sites until the desired object is found: at each step, the search draws a node uniformly at random and asks whether it has a copy
41
Search Example
[Figure: two search examples, one locating the object after 2 probes and one after 4 probes.]
42
Replication Theory
The probability Pr(k) that object i is found at the k-th probe is given by
$\Pr(k) = \Pr(\text{not found in the previous } k-1 \text{ probes}) \cdot \Pr(\text{found in the } k\text{-th probe}) = (1 - r_i/n)^{k-1} \, r_i/n$
k (the search size: the step at which the item is found) is a random variable with a geometric distribution with success probability $r_i/n$, so its expectation is $n/r_i$
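The expectation can be checked empirically. The sketch below assumes the model stated on the slides (each probe draws a node uniformly at random, with replacement) and, as a simplification, places the copies on nodes 0..r-1; the function names and the seed are illustrative.

```python
import random


def expected_search_size(n, r):
    """Mean of a geometric distribution with success probability r/n."""
    return n / r


def simulate_search(n, r, trials, seed=0):
    """Average number of uniform random probes until a copy is hit."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        probes = 1
        # Nodes 0..r-1 hold a copy; keep probing until one of them is drawn.
        while rng.randrange(n) >= r:
            probes += 1
        total += probes
    return total / trials
```

For n = 100 nodes and r = 4 copies the formula gives 25 probes, and the simulated average over many trials should land close to that value.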
43
Replication Theory
$A_i$: the expectation (average search size) for object i is the inverse of the fraction of sites that have replicas of the object, $A_i = n/r_i$
The average search size A over all objects (average number of nodes probed per object query):
$A = \sum_i q_i A_i = n \sum_i q_i/r_i$
Minimize $A = n \sum_i q_i/r_i$
44
Replication Theory
If we have no limit on $r_i$, replicate everything everywhere
Then the average search size is $A_i = n/r_i = 1$ and search becomes trivial
Assume a limit on R and that the average number of replicas per site, $\rho = R/n$, is fixed
How to allocate these R replicas among the m objects: how many replicas per object?
45
Replication Theory
Minimize $\sum_i q_i/p_i$ subject to $\sum_i p_i = 1$ and $l \le p_i \le u$
Monotonicity: since $q_1 \ge q_2 \ge \cdots \ge q_m$, we must have $p_1 \ge p_2 \ge \cdots \ge p_m$
More copies go to the more popular objects, but how many?
46
Uniform Replication
Create the same number of replicas for each object: $r_i = R/m$
Average search size for uniform replication: $A_i = n/r_i = m/\rho$
$A_{\text{uniform}} = \sum_i q_i \, m/\rho = m/\rho$ (that is, $m\,n/R$)
which is independent of the query distribution
47
Proportional Replication
It makes sense to allocate more copies to objects that are frequently queried; this should reduce the search size for the more popular objects
Create a number of replicas for each object proportional to its query rate: $r_i = R\,q_i$
48
Proportional Replication
Number of replicas for each object: $r_i = R\,q_i$
Average search size for proportional replication: $A_i = n/r_i = n/(R\,q_i)$
$A_{\text{proportional}} = \sum_i q_i \, n/(R\,q_i) = m\,n/R = m/\rho = A_{\text{uniform}}$
again independent of the query distribution
Why? Objects whose query rates are greater than average ($> 1/m$) do better with proportional, and the others do better with uniform. The weighted average balances out to be the same
49
Uniform and Proportional Replication
  • Summary
  • Uniform Allocation: $p_i = 1/m$
  • Simple, resources are divided equally
  • Proportional Allocation: $p_i = q_i$
  • Fair, resources per item proportional to demand
  • Reflects current P2P practices

50
Space of Possible Allocations
So what is the optimal way to allocate replicas
so that A is minimized?
  • Compare $q_{i+1}/q_i$ with $p_{i+1}/p_i$
  • As the query rate decreases, how does the ratio of allocated replicas behave?
  • Reasonable: $p_{i+1}/p_i \le 1$
  • $p_{i+1}/p_i = 1$ for uniform

51
Space of Possible Allocations
  • Definition: Allocation $p_1, p_2, p_3, \ldots, p_m$ is in-between Uniform and Proportional if
  • for $1 < i < m$, $q_{i+1}/q_i < p_{i+1}/p_i < 1$
  • (the ratio is 1 for uniform and $q_{i+1}/q_i$ for proportional; we want to favor popular objects, but not too much)
  • Theorem 1: All (strictly) in-between strategies are (strictly) better than Uniform and Proportional

Theorem 2: p is worse than Uniform/Proportional if for all i, $p_{i+1}/p_i > 1$ (more popular gets less) OR for all i, $q_{i+1}/q_i > p_{i+1}/p_i$ (less popular gets less than fair share)
Proportional and Uniform are the worst reasonable strategies
52
Space of allocations on 2 items
[Figure: the space of allocations on two items, with axes p2/p1 and q2/q1 and the Uniform and Proportional allocations marked.]
53
So, what is the best strategy?
54
Square-Root Replication
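Find the $r_i$ that minimize A, $A = \sum_i q_i A_i = n \sum_i q_i/r_i$
This is achieved for $r_i = \lambda \sqrt{q_i}$, where $\lambda = R / \sum_i \sqrt{q_i}$
Then the average search size is $A_{\text{optimal}} = \frac{1}{\rho} \left(\sum_i \sqrt{q_i}\right)^2$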
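The three allocations can be compared numerically with a short sketch. It treats the $r_i$ as real numbers (ignoring integrality and the bounds $1 \le r_i \le n$), and the helper names are illustrative.

```python
import math


def average_search_size(q, r, n):
    """A = n * sum_i q_i / r_i for query rates q and replica counts r."""
    return n * sum(qi / ri for qi, ri in zip(q, r))


def uniform(q, R):
    """Same number of replicas for every object: r_i = R / m."""
    return [R / len(q)] * len(q)


def proportional(q, R):
    """Replicas proportional to the query rate: r_i = R * q_i."""
    return [R * qi for qi in q]


def square_root(q, R):
    """Replicas proportional to sqrt(q_i), scaled so that they sum to R."""
    s = sum(math.sqrt(qi) for qi in q)
    return [R * math.sqrt(qi) / s for qi in q]
```

As an example, with n = 100, R = 200 (so $\rho = 2$) and q = (0.4, 0.3, 0.2, 0.1), uniform and proportional both give $m/\rho = 2$ probes on average, while square-root gives about 1.89, matching $\frac{1}{\rho}(\sum_i \sqrt{q_i})^2$.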
55
How much can we gain by using SR ?
[Figure: the ratio A_uniform/A_SR for Zipf-like query rates.]
56
Other Metrics Discussion
  • Utilization rate: the rate of requests that a replica of an object i receives
  • $U_i = R\,q_i/r_i$
  • For uniform replication,
  • all objects have the same average search size,
  • but replicas have utilization rates proportional
    to their query rates
  • Proportional replication achieves perfect load
    balancing with all replicas having the same
    utilization rate,
  • but average search sizes vary with more popular
    objects having smaller average search sizes than
    less popular ones

57
Replication Summary
58
Pareto Distribution (for the queries)
59
Pareto Distribution (for the queries)
Both model power-law distributions
Zipf: what is the size (popularity) of the r-th ranked item? $y \sim r^{-b}$
Pareto: how many items have size greater than x (look at the frequency distribution)? $P[X > x] \sim x^{-k}$, $P[X = x] \sim x^{-(k+1)} = x^{-a}$
"The r-th hottest item has n queries" is equivalent to saying "r items have n or more queries". This is exactly the definition of the Pareto distribution, except the x and y axes are flipped. Whereas for Zipf we have r (rank) and compute n, in Pareto we have n and compute r (rank)
Reference: http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html
60
Replication (summary)
Each object i is replicated on $r_i$ nodes and the total number of objects stored is R, that is $\sum_{i=1}^{m} r_i = R$
  • (1) Uniform: All objects are replicated at the same number of nodes
  • $r_i = R/m$
  • (2) Proportional: The replication of an object is proportional to the query probability of the object
  • $r_i \propto q_i$
  • (3) Square-root: The replication of an object i is proportional to the square root of its query probability $q_i$
  • $r_i \propto \sqrt{q_i}$

61
Assumption that there is at least one copy per
object
  • Query is soluble if there are sufficiently many
    copies of the item.
  • Query is insoluble if item is rare or non
    existent.
  • What is the search size of a query ?
  • Soluble queries: number of probes until an answer is found.
  • Insoluble queries: maximum search size

62
  • SR is best for soluble queries
  • Uniform minimizes cost of insoluble queries

What is the optimal strategy?
63
[Figure: Uniform and SR allocations compared for $10^4$ items with Zipf-like query rates (w = 1.5), under three workloads: all soluble, 85% soluble, all insoluble.]
64
We now know what we need.
How do we get there?
65
Replication Algorithms
  • Uniform and Proportional are easy
  • Uniform: when an item is created, replicate its key in a fixed number of hosts.
  • Proportional: for each query, replicate the key in a fixed number of hosts (need to know or estimate the query rate)

Desired properties of algorithm
  • Fully distributed, where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search.
  • Converge to/obtain SR allocation when query rates remain steady.

66
Replication - Implementation
Two strategies are popular
Owner Replication: when a search is successful, the object is stored at the requestor node only (used in Gnutella)
Path Replication: when a search succeeds, the object is stored at all nodes along the path from the requestor node to the provider node, following the reverse path back to the requestor (used in Freenet)
67
Achieving Square-Root Replication
  • How can we achieve square-root replication in
    practice?
  • Assume that each query keeps track of the search
    size
  • Each time a query is finished the object is
    copied to a number of sites proportional to the
    number of probes
  • On average, object i will be replicated $c\,n/r_i$ times each time a query is issued (for some constant c)
  • It can be shown that this gives square-root replication

68
Replication - Conclusion
Thus, for Square-root replication an object
should be replicated at a number of nodes that
is proportional to the number of probes that the
search required
69
Replication - Implementation
If a p2p system uses k walkers, the number of nodes between the requestor and the provider node is 1/k of the total number of nodes visited (number of probes)
Then, path replication should result in square-root replication
Problem: it tends to replicate objects at nodes that are topologically along the same path
70
Replication - Implementation
Random Replication: when a search succeeds, we count the number of nodes on the path between the requestor and the provider, say p
Then, randomly pick p of the nodes that the k walkers visited to replicate the object
Harder to implement
71
Achieving Square-Root Replication
What about replica deletion?
Steady state: the creation rate equals the deletion rate
The lifetime of replicas must be independent of object identity or query rate
FIFO or random deletion is OK; LRU or LFU is not
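A sketch of why this yields square-root replication, under two assumptions: each query for object i creates a number of copies proportional to its search size $n/r_i$, and every copy is deleted at the same rate regardless of which object it belongs to.

```latex
\begin{align*}
\text{creation rate of copies of } i &\;\propto\; q_i \cdot \frac{n}{r_i}, \\
\text{deletion rate of copies of } i &\;\propto\; r_i, \\
\text{steady state: } \frac{q_i}{r_i} \propto r_i
  &\;\Longrightarrow\; r_i^2 \propto q_i
  \;\Longrightarrow\; r_i \propto \sqrt{q_i}.
\end{align*}
```

If deletion favored or penalized particular objects, as LRU and LFU do, the second line would no longer be proportional to $r_i$ alone and the fixed point would shift away from the square-root allocation.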
72
Replication Evaluation
  • Study the three replication strategies in the
    Random graph network topology
  • Simulation Details
  • Place the m distinct objects randomly into the
    network
  • Query generator generates queries according to a
    Poisson process at 5 queries/sec
  • Zipf distribution of queries among the m objects (with $\alpha = 1.2$)
  • For each query, the initiator is chosen randomly
  • Then a 32-walker random walk with state keeping and checking every 4 steps
  • Each site stores at most objAllow (40) objects
  • Random Deletion
  • Warm-up period of 10,000 secs
  • Snapshots every 2,000 query chunks

73
Replication Evaluation
  • For each replication strategy
  • What kind of replication ratio distribution does
    the strategy generate?
  • What is the average number of messages per node
    in a system using the strategy
  • What is the distribution of number of hops in a
    system using the strategy

74
Evaluation: Replication Ratio
Both path and random replication generate replication ratios quite close to the square root of the query rates
75
Evaluation: Messages
Path replication and random replication reduce the overall message traffic by a factor of 3 to 4
76
Evaluation: Hops
Much of the traffic reduction comes from reducing the number of hops
Path and random are better than owner. For example, the share of queries that finish within 4 hops is 71% for owner, 86% for path, and 91% for random
77
Summary
  • Random search/replication model: probes to random hosts
  • Proportional allocation: current practice
  • Uniform allocation: best for insoluble queries
  • Soluble queries
  • Proportional and Uniform allocations are two
    extremes with same average performance
  • Square-Root allocation minimizes Average Search
    Size
  • OPT (all queries) lies between SR and Uniform
  • SR/OPT allocation can be realized by simple
    algorithms.

78
Discussion
Cohen et al paper
  • Path replication overshoots or undershoots the fixed point if queries arrive in large bursts or if the time between a search and the subsequent copy generation is large
  • More involved algorithms than path replication
  • Extensions for variable object sizes or nodes with heterogeneous capacities
Many issues: other types of graphs, adaptability, etc.
79
Agenda for today
  • 1. Description of the course projects
  • 2. General remarks on replication
  • 3. Replication Theory for Unstructured (Cohen et al paper)
  • 4. Epidemic Algorithms for Updates (Demers et al paper)
80
Replication in Unstructured P2P: Epidemic Algorithms
81
  • Replication Policy
  • How many copies
  • Where (owner, path, random path)
  • Update Policy
  • Synchronous vs Asynchronous
  • Master Copy

82
Methods for spreading updates
  • Push: updates originate from the site where the update appeared, to reach the sites that hold copies
  • Pull: the sites holding copies contact the master site
  • Expiration times
Epidemics for spreading updates
83
A. Demers et al, Epidemic Algorithms for Replicated Database Maintenance, PODC 1987
Update at a single site
Randomized algorithms for distributing updates and driving replicas towards consistency
Ensure that the effect of every update is eventually reflected in all replicas
Sites become fully consistent only when all updating activity has stopped and the system has become quiescent
Analogous to epidemics
84
Methods for spreading updates
Direct mail: each new update is immediately mailed from its originating site to all other sites
  • (+) Timely, reasonably efficient
  • (-) Not all sites know all other sites (stateless)
  • (-) Mails may be lost
Anti-entropy: every site regularly chooses another site at random and, by exchanging content, resolves any differences between them
  • (+) Extremely reliable, but requires exchanging content and resolving updates
  • (-) Propagates updates much more slowly than direct mail
85
  • Methods for spreading updates
  • Rumor mongering
  • Sites are initially ignorant; when a site receives a new update, the update becomes a hot rumor
  • While a site holds a hot rumor, it periodically chooses another site at random and ensures that the other site has seen the update
  • When a site has tried to share a hot rumor with too many sites that have already seen it, the site stops treating the rumor as hot and retains the update without propagating it further
  • Rumor cycles can be more frequent than anti-entropy cycles, because they require fewer resources at each site, but there is a chance that an update will not reach all sites
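The behavior described above can be sketched as a small synchronous simulation. This is one possible variant, not the paper's exact protocol: it assumes push-only exchanges, a per-site counter of "wasted" contacts, and a hypothetical threshold `k` after which the rumor stops being hot.

```python
import random


def rumor_mongering(n, k, seed=0):
    """Simulate push rumor mongering; return (cycles, residue).

    residue is the fraction of sites that never received the update.
    """
    rng = random.Random(seed)
    state = ["susceptible"] * n
    misses = [0] * n          # contacts with sites that already had the update
    state[0] = "infective"    # the update is injected at site 0
    cycles = 0
    while "infective" in state:
        cycles += 1
        # Sites infected during this cycle start spreading in the next one.
        for s in [i for i in range(n) if state[i] == "infective"]:
            t = rng.randrange(n - 1)   # pick another site uniformly at random
            if t >= s:
                t += 1
            if state[t] == "susceptible":
                state[t] = "infective"
            else:
                misses[s] += 1
                if misses[s] >= k:
                    state[s] = "removed"   # the rumor is no longer hot here
    return cycles, state.count("susceptible") / n
```

A nonzero residue is the "chance that an update will not reach all sites" from the last bullet; raising `k` lowers the residue at the cost of more redundant messages.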

86
  • Anti-entropy and rumor spreading are examples of
    epidemic algorithms
  • Three types of sites
  • Infective: a site that holds an update it is willing to share
  • Susceptible: a site that has not yet received an update
  • Removed: a site that has received an update but is no longer willing to share it
  • Anti-entropy: a simple epidemic where all sites are always either infective or susceptible

87
A set S of n sites, each storing a copy of a database
The database copy at site $s \in S$ is a time-varying partial function
$s.\mathrm{ValueOf}: K \to (v{:}\,V \times t{:}\,T)$
where K is the set of keys, V the set of values, and T the set of timestamps (totally ordered by <)
V contains the element NIL
$s.\mathrm{ValueOf}[k] = (\mathrm{NIL}, t)$: the item with key k has been deleted from the database
Assume just one item: $s.\mathrm{ValueOf} \in (v{:}\,V \times t{:}\,T)$, thus an ordered pair consisting of a value and a timestamp
The first component may be NIL, indicating that the item was deleted by the time indicated by the second component
88
  • The goal of the update distribution process is to drive the system towards
  • $\forall s, s' \in S: s.\mathrm{ValueOf} = s'.\mathrm{ValueOf}$
  • Operation invoked to update the database
  • $\mathrm{Update}[v{:}\,V]: s.\mathrm{ValueOf} \leftarrow (v, \mathrm{Now}[\,])$

89
Direct Mail
At the site s where an update occurs:
For each $s' \in S$: PostMail[to: s', msg: ("Update", s.ValueOf)]
(s: originator of the update; s': receiver of the update)
Each site s' receiving the update message ("Update", (v, t)):
If $s'.\mathrm{ValueOf}.t < t$ then $s'.\mathrm{ValueOf} \leftarrow (v, t)$
  • The complete set S must be known to s (stateful server)
  • PostMail messages are queued so that the server is not delayed (asynchronous), but may fail when queues overflow or their destinations are inaccessible for a long time
  • n (number of sites) messages per update
  • Traffic proportional to n and the average distance between sites

90
Anti-Entropy
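At each site s, periodically execute:
For some $s' \in S$: ResolveDifference[s, s']
Three ways to execute ResolveDifference:
  • Push (sender (server) driven): if $s.\mathrm{ValueOf}.t > s'.\mathrm{ValueOf}.t$ then $s'.\mathrm{ValueOf} \leftarrow s.\mathrm{ValueOf}$ (s pushes its value to s')
  • Pull (receiver (client) driven): if $s.\mathrm{ValueOf}.t < s'.\mathrm{ValueOf}.t$ then $s.\mathrm{ValueOf} \leftarrow s'.\mathrm{ValueOf}$ (s pulls s' and gets the value of s')
  • Push-Pull: if $s.\mathrm{ValueOf}.t > s'.\mathrm{ValueOf}.t$ then $s'.\mathrm{ValueOf} \leftarrow s.\mathrm{ValueOf}$; if $s.\mathrm{ValueOf}.t < s'.\mathrm{ValueOf}.t$ then $s.\mathrm{ValueOf} \leftarrow s'.\mathrm{ValueOf}$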
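The three variants of ResolveDifference can be written as one small function. This is a sketch for the single-item case; the `Site` class, its field names, and the `mode` strings are illustrative, and `None` stands in for NIL.

```python
from dataclasses import dataclass


@dataclass
class Site:
    value: object   # None stands in for NIL (item deleted)
    t: int          # timestamp of the latest update this site has seen


def resolve_difference(s, other, mode="push-pull"):
    """Site s contacts site other; the newer (value, timestamp) pair wins."""
    if mode in ("push", "push-pull") and s.t > other.t:
        # s is newer: push its value to the other site.
        other.value, other.t = s.value, s.t
    elif mode in ("pull", "push-pull") and s.t < other.t:
        # the other site is newer: s pulls its value.
        s.value, s.t = other.value, other.t
```

With push only, a stale initiator learns nothing from a fresher partner; with pull only, a fresh initiator cannot update a stale partner; push-pull covers both directions in one exchange.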
91
Anti-Entropy
  • Assume that
  • Site s' is chosen uniformly at random from the set S
  • Each site executes the anti-entropy algorithm once per period
  • It can be proved that
  • An update will eventually infect the entire population
  • Starting from a single infected site, this can be achieved in time proportional to the log of the population size

92
Anti-Entropy
Let $p_i$ be the probability of a site remaining susceptible (it has not received the update) after the i-th cycle of anti-entropy
For pull, a site remains susceptible after cycle i+1 if (a) it was susceptible after cycle i and (b) it contacted a susceptible site in cycle i+1:
$p_{i+1} = (p_i)^2$
For push, a site remains susceptible after cycle i+1 if (a) it was susceptible after cycle i and (b) no infectious site chose to contact it in cycle i+1:
$p_{i+1} = p_i \,(1 - 1/n)^{n(1 - p_i)}$
where $1 - 1/n$ is the probability that the site is not contacted by a given node and $n(1 - p_i)$ is the number of infectious nodes at cycle i
Pull is preferable to push
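Iterating the two recurrences shows why pull wins once most sites are already infected. The sketch below simply applies the formulas above; the starting value, the threshold, and the helper names are illustrative choices.

```python
def pull_step(p):
    """p_{i+1} = p_i^2: the site must have contacted a susceptible site."""
    return p * p


def push_step(p, n):
    """p_{i+1} = p_i * (1 - 1/n)^(n * (1 - p_i)): no infective site called."""
    return p * (1 - 1 / n) ** (n * (1 - p))


def cycles_until(step, p0, threshold, max_cycles=1000):
    """Number of cycles until the susceptible fraction drops to threshold."""
    p, cycles = p0, 0
    while p > threshold and cycles < max_cycles:
        p = step(p)
        cycles += 1
    return cycles
```

Starting from p = 0.1 with a threshold of 1e-6, pull needs 3 cycles (0.1, 0.01, 1e-4, 1e-8), whereas push shrinks p by only a factor of roughly 1/e per cycle when p is small and so needs noticeably more.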
93
Anti-Entropy
More next week