Optimistic replication for Internet data services (presentation transcript)
1
Optimistic replication for Internet data services
  • Yasushi Saito
  • Hank Levy

http://porcupine.cs.washington.edu/
University of Washington Department of Computer
Science and Engineering, Seattle, WA, U.S.A.
2
Overview
  • Simple and lightweight algorithm suitable for
    cluster-based Internet data services
  • Dynamic replica addition/deletion.
  • Ensures eventual consistency of replicas.
  • Completely decentralized.
  • Tolerates multiple node failures, partitions,
    etc.
  • Space- and cost-efficient.
  • Implemented on the Porcupine scalable email server.

3
Outline
  • Motivation
  • Examples
  • Correctness
  • Practical Issues
  • Performance
  • Conclusion

4
Motivation
  • Porcupine cluster-based mail server
  • Manageability, availability, and performance via
    homogeneous architecture and dynamic data
    distribution
  • Other applications: BBS, Web, Calendar

[Diagram: clients on the Internet reaching the cluster through a naming
and load-balancing service.]
5
Goals and Non-goals
  • Goals
  • Dynamic addition/removal of replicas
  • Space and computational efficiency
  • Fault tolerance
  • Simplicity
  • Non-goals
  • Single-copy consistency (it's the Internet, anyway)

6
Why a new algorithm?
  • PC-based clusters present a new environment.
  • Prior art focused on two extreme environments:
    mainframe/LAN and laptop/modem.
  • Single-copy algorithms are not available enough.
  • Mobile replication algorithms are not optimized
    for mostly-connected environments.
  • Very few algorithms allow addition/deletion of
    replicas.

7
Algorithm Overview
  • Contents-pushing (cf. Usenet, MS Active
    Directory)
  • → Computational efficiency
  • Two-phase protocol (Apply, Retire)
  • → Space efficiency
  • Unified treatment of contents updates and replica
    addition/deletion
  • Thomas write rule + node discovery to resolve
    conflicting updates
  • → Simplicity, fault tolerance

8
Outline
  • Motivation
  • Examples
  • Updating contents
  • Adding and deleting replicas
  • Resolving conflicting updates
  • Correctness
  • Practical Issues
  • Performance
  • Conclusion

9
Example: Updating contents

[Diagram: an object replicated at A, B, and C. Each replica holds the
object contents and the replica set {A, B, C}. At A, an update record
carries a timestamp (3:10pm) and an ack set; the update record exists
only during update propagation.]
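As a sketch of the per-object state named on that slide (class and field names here are mine, not from the talk), the replica-side data structures might look like:

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class UpdateRecord:
    timestamp: str                 # e.g. "3:10pm" on the slide
    new_contents: str
    target_set: Set[str]           # replicas that must receive the update
    ack_set: Set[str] = field(default_factory=set)  # replicas that acked

@dataclass
class ReplicatedObject:
    contents: str
    replica_set: Set[str]                    # e.g. {"A", "B", "C"}
    update: Optional[UpdateRecord] = None    # exists only during propagation

obj = ReplicatedObject(contents="old mail", replica_set={"A", "B", "C"})
obj.update = UpdateRecord("3:10pm", "new mail", target_set={"B", "C"})
```

The key design point from the slide is that the update record is transient: once propagation completes, only the contents and the replica set remain.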
10
Example: Update Propagation

[Diagram: A pushes the 3:10pm update to B and C; B and C store the new
contents and acknowledge back to A, which records them in its ack set.]
11
Update Retirement

[Diagram: once every replica has acknowledged the 3:10pm update, A sends
"Retire 3:10pm" messages to B and C, and all three replicas delete the
update record.]
12
Example: Final State
  • Algorithm is quiescent after update retirement
  • New contents are absent from the update record
  • Contents are read from the replica directly
  • Update is stored only during propagation
  • → Computational and space efficiency

[Diagram: A, B, and C each hold the new contents and the replica set
{A, B, C}, with no outstanding update records.]
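The propagation and retirement walkthrough above can be condensed into a small simulation. This is a sketch under my own naming, not the paper's code: one origin applies locally, pushes the update to every other replica, and retires the record once all acks arrive.

```python
def propagate(origin, replicas, timestamp, new_contents):
    """Push an update from `origin` to every other replica, collect acks,
    then retire the update everywhere. Returns (state, message_count)."""
    state = {r: {"contents": None, "update": None} for r in replicas}
    messages = 0

    # Origin applies locally and creates a transient update record.
    state[origin]["contents"] = new_contents
    record = {"ts": timestamp, "targets": set(replicas) - {origin}, "acks": set()}
    state[origin]["update"] = record

    # Phase 1: Apply. Each target stores the contents and acknowledges.
    for r in record["targets"]:
        messages += 1                      # Apply message
        state[r]["contents"] = new_contents
        record["acks"].add(r)

    # Phase 2: Retire. Once all targets acked, delete the update record.
    if record["acks"] == record["targets"]:
        for r in record["targets"]:
            messages += 1                  # Retire message (batchable)
        state[origin]["update"] = None     # algorithm is now quiescent

    return state, messages

state, msgs = propagate("A", ["A", "B", "C"], "3:10pm", "new mail")
```

With three replicas this sends four messages in total, matching the 2(N-1) figure quoted later in the talk.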
13
Replica addition and removal
  • A issues an update to delete C.
  • Unified treatment of updates to contents and
    to the replica set.

[Diagram: the 3:10pm update carries the new replica set {A, B}, but its
target set and ack set still include C, so that C learns of its own
removal. Once all targets ack, only A and B remain as replicas.]
14
What if updates conflict?
  • Thomas write rule:
  • The newest update always wins.
  • The older update is canceled by being overwritten
    by the newer update.
  • The same rule applies to replica addition/deletion.
  • But there are some subtleties...
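A minimal sketch of the Thomas write rule as stated above (function and field names are illustrative, not from the paper): between two conflicting updates, the one with the newer timestamp survives.

```python
def thomas_write(stored, incoming):
    """Return the update that survives; `None` means no stored update."""
    if stored is None or incoming["ts"] > stored["ts"]:
        return incoming     # newer update overwrites the older one
    return stored           # older incoming update is canceled

a = {"ts": (15, 10), "contents": "from A"}   # the slide's 3:10pm update
b = {"ts": (15, 20), "contents": "from B"}   # the slide's 3:20pm update
```

Whichever order the two updates arrive in, the 3:20pm update wins, which is what makes the rule order-insensitive and hence simple to run on every node independently.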

15
Update conflict resolution
  • A adds C, and B adds D, simultaneously.
  • B must discover C and let C delete the replica
    contents.

[Diagram: A's 3:10pm update installs replica set {A, B, C}; B's 3:20pm
update installs replica set {A, B, D}. The 3:20pm update wins, so C must
eventually learn that it is not in the final replica set.]
16
Node discovery protocol

[Diagram: while the 3:10pm and 3:20pm updates propagate, B discovers C
from the older update record, adds C to the 3:20pm update's target set
("Add targets C"), and sends C an "Apply 3:20 update" message; the newer
update then cancels C's state.]
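The discovery step above can be sketched as follows. This is my own simplification, not the paper's protocol: when a newer update meets an older conflicting one, any replica named only by the older update is added to the newer update's targets, and replicas absent from the winning replica set must delete their contents.

```python
def discover_and_resolve(newer, older):
    """Merge knowledge from a conflicting older update into the newer one.
    `newer`/`older` are dicts with 'ts', 'replica_set', and 'targets'."""
    assert newer["ts"] > older["ts"]   # Thomas write rule: newer wins
    # Node discovery: replicas named only in the older update must still
    # be told about the newer (winning) update.
    discovered = older["replica_set"] - newer["replica_set"]
    newer["targets"] |= discovered
    # Those discovered replicas are not in the winning replica set,
    # so they must delete their copy of the contents.
    return newer, discovered

# The slide's scenario: A added C at 3:10pm, B added D at 3:20pm.
older = {"ts": 1, "replica_set": {"A", "B", "C"}, "targets": {"B", "C"}}
newer = {"ts": 2, "replica_set": {"A", "B", "D"}, "targets": {"A", "D"}}
newer, doomed = discover_and_resolve(newer, older)
```

Here B ends up targeting C with the 3:20pm update, and C, seeing it is outside the winning replica set, deletes its contents.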
17
Proof of Correctness
  • Claim: all live replicas will store the newest
    update, regardless of
  • the number of concurrent updates,
  • the number of replicas added or removed,
  • the number of node failures,
  • provided that nodes can discover each other at
    least indirectly.
  • E.g., when partitioned, each partition will
    become consistent.

18
Outline
  • Motivation
  • Examples
  • Correctness
  • Practical Issues
  • Performance
  • Conclusion

19
Practical Issues
  • Handling long-dead nodes
  • The algorithm maintains consistency of the
    remaining replicas.
  • But updates will get stuck and clog nodes'
    disks.
  • Solution: erase dead nodes' names from replica
    sets and update records after a grace period.
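An illustrative sketch of that cleanup (the grace-period length and all names here are my assumptions, not values from the talk): after a node has been unreachable longer than the grace period, its name is erased from the replica set and from any outstanding update records, so updates can retire instead of clogging disks.

```python
GRACE_PERIOD = 14 * 24 * 3600   # hypothetical grace period: two weeks

def prune_dead_nodes(now, last_seen, replica_set, update_records):
    """Drop nodes unseen for longer than GRACE_PERIOD; return them."""
    dead = {n for n, t in last_seen.items() if now - t > GRACE_PERIOD}
    replica_set -= dead
    for rec in update_records:
        rec["targets"] -= dead   # stop waiting for acks from dead nodes
        rec["acks"] -= dead
    return dead

now = 100 * 24 * 3600
last_seen = {"A": now, "B": now - 3600, "C": now - 30 * 24 * 3600}
replicas = {"A", "B", "C"}
records = [{"targets": {"B", "C"}, "acks": {"C"}}]
dead = prune_dead_nodes(now, last_seen, replicas, records)
```

Once C is pruned from the record's target set, the outstanding update only needs B's ack and can retire normally.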

20
Performance: Networking overhead
  • Each update sends Apply and Retire msgs.
  • Retire can be batched w/o affecting users.
  • Actual # of msgs ≤ 2(N-1).

[Chart: measured networking overhead on a fully loaded Porcupine mail
server.]
21
Performance: Space overhead
  • Each update is small
    (contents are read directly from the replicas).
  • Updates are deleted quickly after retirement.
  • The # of outstanding updates is independent of
    the # of objects on a node.

[Chart: roughly 100K for update records vs. 2G for email messages.]
22
Conclusion
  • Simple and lightweight algorithm suitable for
    cluster-based Internet data services
  • Contributions:
  • Simple dynamic replica addition protocol
  • Node discovery for resolving concurrent updates
  • Update retirement using synchronized clocks
  • Code available at
  • http://porcupine.cs.washington.edu/

23
Potential Applications
  • This algorithm is not just for email...
  • Imagine proxies for update-intensive web sites.
  • Today, they use timeouts and polling.
  • Dynamic replication improves availability.

[Diagram: a master site replicating to proxies.]
25
Performance: Networking overhead (bytes)
  • Each network message is mostly occupied by actual
    object contents.
  • Overhead added by the replication service ≤ 6.