Replication: Synchronous and Asynchronous - PowerPoint PPT Presentation

Provided by: rond9

1
Replication: Synchronous and Asynchronous

Amr El Abbadi
Department of Computer Science
University of California
Santa Barbara, CA 93106

2
Organization

The basic replication model [BHG87]
Serializability theory for replicated databases
Replica control protocols:
  quorums
  available copies
  view-based replication
Asynchronous replication:
  Wuu and Bernstein: the epidemic model

3
Why Replicate Data?

Application semantics (domain servers, routing info, etc.)
Fault tolerance (banks, information, etc.)
Performance (search engines, parallel applications, etc.)

4
The Synchronous approach

Correctness: a replicated database should behave like a one-copy database insofar as the users can tell.
Model: each object x is implemented by a set of copies x1, x2, x3, ... that reside on different sites s1, s2, s3, ...

5
Simple Approach

Read one/write all protocol:
read[x] is translated to a read of any copy xa.
write[x] is translated to a write of all copies xa, xb, ...
Use any correct concurrency control protocol.
What if failures happen? Then no write operation can complete!
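
The read-one/write-all rule can be sketched in a few lines. This is only an illustration under stated assumptions: the `Site` class, its `up` flag, and the `read`/`write` helpers are hypothetical names, and concurrency control is left out entirely.

```python
class Site:
    """A hypothetical site holding one copy of each object."""
    def __init__(self, name):
        self.name = name
        self.up = True      # assumption: failures are modeled as a simple flag
        self.store = {}

def read(sites, x):
    # Read one: any copy on a site that is up will do.
    for s in sites:
        if s.up:
            return s.store[x]
    raise RuntimeError("no copy of %r is reachable" % x)

def write(sites, x, value):
    # Write all: refuse the write unless every copy can be updated.
    if not all(s.up for s in sites):
        raise RuntimeError("write blocked: some copy of %r is down" % x)
    for s in sites:
        s.store[x] = value
```

A single failed site is enough to block every write, while reads keep working from any surviving copy.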

6
Write all available copies

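Consider the following history:
w0[xa]                     w1[xa]
w0[xb]  r2[xb]  Fail(b)
w0[yc]  r1[yc]             w2[yc]
Since t2 reads-x-from t0 (and t1 also writes x), the order must be t0 → t2 → t1.
But t1 reads-y-from t0 (and t2 also writes y), so the order must be t0 → t1 → t2. Contradiction!
Yet the SG is acyclic: t0 → t1, t0 → t2, t1 → t2.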

7
Correctness of replicated objects

One-copy equivalence: the different copies of an object must appear as a single copy.
Serializability: the concurrent execution of a set of transactions must be equivalent to a serial execution.
One-copy serializability: the concurrent execution of a set of transactions must be equivalent to a serial history on single-copy objects.

8
One-Copy Serialization Graph

Given a history H, a 1-SG(H) is SG(H) with enough edges added such that:
∀ objects x, 1-SG(H) embodies a total order (<<) on all transactions that write x.
If tj reads-x-from ti, and ti << tk, then 1-SG(H) contains a path from tj to tk.
[Figure: graph over ti, tk, tj illustrating the path from tj to tk]

9
Back to example

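Recall:
w0[xa]                     w1[xa]
w0[xb]  r2[xb]  Fail(b)
w0[yc]  r1[yc]             w2[yc]
SG is t0 → t1, t0 → t2, t1 → t2.
Since t1 reads-y-from t0, and t0 << t2 (both write y), then t1 << t2.
But t2 reads-x-from t0, and t0 << t1 (both write x), then t2 << t1.
[Figure: 1-SG over t0, t2, t1 with a cycle between t1 and t2]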
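
The example can be checked mechanically. The sketch below is a brute-force illustration, not the construction from the slides: it assumes one particular linearization of the history, treats the first letter of a copy name as its object, and compares reads-from relations only (final writes are ignored). All function names are hypothetical.

```python
from itertools import permutations

# One linearization of the history: (transaction, op, copy).
# Assumption: the object is the first letter of the copy name ("xa" copies "x").
HISTORY = [
    (0, "w", "xa"), (0, "w", "xb"), (0, "w", "yc"),
    (2, "r", "xb"),   # t2 reads x from t0, then site b fails
    (1, "r", "yc"),   # t1 reads y from t0
    (1, "w", "xa"),   # t1 writes the only available copy of x
    (2, "w", "yc"),
]

def sg_edges(history):
    """Edges ti -> tj for conflicting operations on the same copy."""
    edges = set()
    for i, (ti, oi, ci) in enumerate(history):
        for tj, oj, cj in history[i + 1:]:
            if ti != tj and ci == cj and "w" in (oi, oj):
                edges.add((ti, tj))
    return edges

def is_acyclic(edges):
    """Brute force: some ordering of the nodes respects every edge."""
    nodes = {n for e in edges for n in e}
    return any(all(p.index(a) < p.index(b) for a, b in edges)
               for p in permutations(nodes))

def reads_from(ops, key):
    """Set of (reader, object, writer); key picks per-copy or per-object state."""
    last_writer, result = {}, set()
    for t, op, copy in ops:
        if op == "w":
            last_writer[key(copy)] = t
        else:
            result.add((t, copy[0], last_writer[key(copy)]))
    return result

def one_copy_serializable(history):
    """Try every serial order on single-copy objects; compare reads-from."""
    target = reads_from(history, key=lambda c: c)
    txns = sorted({t for t, _, _ in history})
    for order in permutations(txns):
        serial = [op for t in order for op in history if op[0] == t]
        try:
            if reads_from(serial, key=lambda c: c[0]) == target:
                return True
        except KeyError:
            continue  # a read came before any write of that object
    return False
```

For this history `is_acyclic(sg_edges(HISTORY))` holds while `one_copy_serializable(HISTORY)` does not, which is the gap the 1-SG is meant to close.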

10
Available Copies Protocol [BG 83]

Recall:
w0[xa]                     w1[xa]
w0[xb]  r2[xb]  Fail(b)
w0[yc]  r1[yc]             w2[yc]
Introduce the failure of a site as an atomic transaction OUTb (similarly INb for recovery), which causes transactions to change their write sets (change directory info).
[Figure: graph over t0, t2, OUTb, t1, with a path explicitly forced through OUTb]

11
Available copies protocol

+ Inexpensive read operations
+ Tolerates site failures
- Does NOT tolerate partitioning failures!
[Figure: network split into partitions P1 and P2]

12
Quorum Consensus Protocol [Gifford 79]

Extend the idea of quorums for mutual exclusion to read and write operations, i.e., read and write quorums.
[Figure: every read quorum intersects every write quorum; every two write quorums intersect]

13
Quorum Consensus Protocol

Associate a version number with each copy.
Write operation:
  Determine max-version-no of a write quorum.
  Update the write quorum with the new value and set the version numbers to max-version-no + 1.
Read operation:
  Read the value of the copy with max-version-no in a read quorum.
Use a correct concurrency control protocol.
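
The version-number rules can be sketched as follows. This is a minimal illustration: the `Copy` class and function names are hypothetical, concurrency control is omitted, and `is_valid` assumes size-based quorums with one vote per copy, where the intersection requirements reduce to r + w > n and 2w > n.

```python
class Copy:
    """One replica of an object, tagged with a version number."""
    def __init__(self):
        self.value = None
        self.version = 0

def read(quorum):
    """Return the value of the copy with the highest version number."""
    newest = max(quorum, key=lambda c: c.version)
    return newest.value

def write(quorum, value):
    """Install value on every copy in the quorum with a fresh version."""
    new_version = max(c.version for c in quorum) + 1
    for c in quorum:
        c.value = value
        c.version = new_version

def is_valid(n, r, w):
    # Assumption: quorums are any r (read) or w (write) of n equal copies.
    # Read/write and write/write quorums must intersect.
    return r + w > n and 2 * w > n
```

With five copies and quorums of three, a write to copies 0..2 followed by a write to copies 2..4 leaves copy 2 holding the newest version, so any read quorum of three sees it.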

14
Correctness

The SG(h) for any execution created by the quorum consensus protocol is:
Acyclic: guaranteed by the correct concurrency control protocol.
A 1-SG(h): all conflicting operations conflict on a common copy, so
(1) SG(h) has a total order on all write operations,
(2) SG(h) orders all read and write conflicts.

15
Quorum Consensus Protocol

+ No special treatment for failures and recovery.
+ Tolerates both site and partitioning failures.
- Expensive read operations.
- Large number of copies to tolerate a given number of failures, e.g., 3 copies to tolerate 1 failure, 5 copies to tolerate 2 failures, etc.
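
The copy counts quoted above match what majority quorums give. A small sketch, assuming majority quorums over equally weighted copies (the function names are hypothetical):

```python
def majority(n):
    """Smallest number of copies that forms a majority of n."""
    return n // 2 + 1

def tolerated_failures(n):
    # Assumption: operations proceed as long as a majority is reachable.
    return n - majority(n)
```

Under this assumption an even number of copies buys nothing extra: four copies tolerate one failure, the same as three.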

16
Virtual Partitions Protocol [El Abbadi et al. 85, 86]

Quorums can tolerate partitions.
Available copies allows read-one.
We want to combine the best of both worlds!
Use quorums to decide when to execute an operation.
Use read-one write-all-available for actual execution.

17
Views

We associate with each site s, view(s), which is
the set of sites s assumes it can communicate
with.
Ideally:

[Figure: sites a, b, c labeled with their views, {a,c} or {b}]

18
Virtual Partitions Rule

Accessibility Rule: a transaction executes only if a majority of sites are in its view.
Read/Write Rule: read one copy, write all copies in view.
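
The two rules can be sketched together. This is an illustration only: the three-site system, the `stores` mapping from site name to local data, and the helper names are assumptions, and the majority is counted over sites as the slide states it.

```python
ALL_SITES = frozenset({"a", "b", "c"})  # hypothetical three-site system

def accessible(view):
    """Accessibility rule: the view must hold a majority of all sites."""
    return 2 * len(view & ALL_SITES) > len(ALL_SITES)

def write(view, stores, x, value):
    """Write all copies in the view, provided the view has a majority."""
    if not accessible(view):
        raise RuntimeError("view %s lacks a majority" % sorted(view))
    for s in view:
        stores[s][x] = value

def read(site, view, stores, x):
    """Read one copy (the local one), provided the view has a majority."""
    if not accessible(view):
        raise RuntimeError("view %s lacks a majority" % sorted(view))
    return stores[site][x]
```

A site whose view is only itself can neither read nor write, even though its local copy is available.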

[Figure: sites a, b, c labeled with the views {a,b}, {a,b,c}, and {b,c}]

19
Virtual Partitions Protocol

Communication Rule: only sites with the same view are allowed to communicate.
Each new view has an associated view-id.
View changes:
  The initiating site s decides on the members of the new view, and picks a view-id greater than any previous one.
  s then executes an update transaction to bring all copies in the view up to the most up-to-date value for each object.
  The update transaction accesses all copies of an object with a majority of sites in the new view.
  A site participates in a new update transaction only if its local view-id is less than the proposed view-id.
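
The view-id comparison and the communication rule can be sketched as below. Only these two checks are shown; the update transaction itself is omitted, and `SiteState`, `join_view`, and `may_communicate` are hypothetical names.

```python
class SiteState:
    """Per-site view bookkeeping (a simplified, hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.view_id = 0
        self.view = {name}

def join_view(site, proposed_id, members):
    """Adopt the proposed view only if its id is newer than the local one."""
    if site.view_id < proposed_id:
        site.view_id = proposed_id
        site.view = set(members)
        return True
    return False

def may_communicate(s, t):
    """Communication rule: both sites must hold the same view."""
    return s.view_id == t.view_id and s.view == t.view
```

A site that has already adopted view-id 1 refuses a second proposal carrying the same id, which keeps stale or competing view changes from taking effect.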

20
Correctness idea

Global correctness:
  majority rule
Local correctness:
  read-one write-all
  correct concurrency control protocol

21
Virtual partitions Protocol

+ Tolerates partitions and site failures
+ Allows the read-one rule
- Costly update transaction

22
Asynchronous or Lazy replication

In large Internet-type settings, transaction-based replication is:
  too expensive (remember 2PC)
  unrealistic (not all sites are up all the time)
  not scalable (large number of sites)
Epidemic approach (Bayou project at Xerox): information is changed locally, and then propagated in a lazy manner to all other replicas.
Correctness is based on causality.

23
Replicated dictionary problem

"Efficient solutions to the replicated log and dictionary problems", Wuu and Bernstein [PODC 84].
Basic assumptions:
  Sites may crash, links may fail, and the network may partition.
  Each site maintains a local clock (a counter).
  Local events are atomic.
  Use Lamport's event execution model and happens-before relation.

24
The log problem

Each site maintains a copy of the log.
The log contains local events, i.e.,
  insert
  delete
The goal of the algorithm is to keep all copies of the log up to date.
L_i is the copy of the log at site i.
L(e) is the contents of log L_node(e) immediately after event e is executed.

25
The log problem

Log problem: find an algorithm that maintains the log such that, given an execution <E, →>, ∀ events e, f: if f → e then f is in L(e).
General approach:
  For each local event, insert a record in the local log.
  Exchange logs to update other sites.
Main question: when to exchange logs? With application communication, to capture the happens-before relation.

26
Solutions to the log problem

A solution:
  Site i sends to site j all records in the log that were inserted since i last sent a message to j.
  WHY INCORRECT?
Another solution:
  Each site i includes L_i with each message.
  On receiving a message, a site j incorporates all new event records.
  BAD:
    The entire log is sent with each message.
    The entire log is kept at each node.

27
Efficient solution for log problem

Observation 1: once i knows that j knows of an event e (which may have occurred on site k), i does not need to include event e in messages sent to j.
Observation 2: once i knows that all sites know about an event e, i does not need to keep a record of e in its local log.

28
2 Dimensional Time-Table

TT_i[n,n]: an n × n timetable kept at site i.
If TT_i[j,k] = t, then site i knows that site j has learned of all events that occurred at site k up to time t.

[Figure: timetable with entry t at row j, column k]

29
The 2 dimensional timetable

Notes:
  Site j might actually know about more events, but site i may not be aware of it.
  TT_i[i,i] is the value of the clock at site i.
  TT_i[i,k] is the clock value at site k of the most recent event at site k that site i is aware of.

30
Two dimensional timetable

Let hasrec(TT_i, e, k) be true iff TT_i[k, node(e)] ≥ time(e).
The algorithm must guarantee that if hasrec(TT_i, e, k) is true, then site k has learned of event e.
Note: site i need not send a record of event e to site k if hasrec(TT_i, e, k) is true.
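
As a small illustration, hasrec is a single table lookup. The sketch assumes the timetable is a list of rows indexed by site number and that an event record carries its originating node and local time; the `Event` field names are hypothetical. The comparison is inclusive, matching the "up to time t" reading of a timetable entry.

```python
from collections import namedtuple

# Hypothetical event record: originating site, local clock value, operation.
Event = namedtuple("Event", ["node", "time", "op"])

def hasrec(tt, e, k):
    """True if, according to timetable tt, site k has learned of event e."""
    return tt[k][e.node] >= e.time
```

For example, with `tt = [[3, 0], [1, 2]]`, site 1 is known to have seen events at site 0 up to time 1, so an event at site 0 with time 2 must still be sent to it.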

31
Log maintenance

Initialize all entries in TT to 0.
For each local operation, insert a copy in the local log.
With each send operation from site i to site k, piggyback TT_i and the following subset of the local log L_i: all records e such that hasrec(TT_i, e, k) is not true.
On receipt of a message from site k by site i:
  Incorporate all new events into the local log.
  Update TT_i:
    Row i: element-wise max of the local row i and the remote row k.
    All entries: element-wise max of the local and remote tables.
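
The log maintenance steps can be sketched as a small class. This is a simplified illustration, not the published algorithm verbatim: sites are numbered 0..n-1, the local clock is kept in TT[i][i], messages are plain tuples, and the `Site`, `Event`, and method names are assumptions. The hasrec test uses an inclusive comparison.

```python
import copy
from collections import namedtuple

Event = namedtuple("Event", ["node", "time", "op"])  # hypothetical record

class Site:
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.tt = [[0] * n for _ in range(n)]   # all entries start at 0
        self.log = set()

    def hasrec(self, e, k):
        return self.tt[k][e.node] >= e.time

    def local_event(self, op):
        self.tt[self.i][self.i] += 1            # advance the local clock
        e = Event(self.i, self.tt[self.i][self.i], op)
        self.log.add(e)                         # record the event locally
        return e

    def send(self, k):
        """Message for site k: records k may be missing, plus a copy of TT."""
        partial = {e for e in self.log if not self.hasrec(e, k)}
        return partial, copy.deepcopy(self.tt), self.i

    def receive(self, message):
        partial, tt_k, k = message
        self.log |= partial                     # incorporate new events
        for c in range(self.n):                 # row i: max with sender's row k
            self.tt[self.i][c] = max(self.tt[self.i][c], tt_k[k][c])
        for r in range(self.n):                 # every entry: element-wise max
            for c in range(self.n):
                self.tt[r][c] = max(self.tt[r][c], tt_k[r][c])
```

In a three-site run where site 0 records an event and sends to site 1, a later message from site 1 back to site 0 omits that event, because site 1's timetable shows that site 0 already has it.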

32
Dictionary problem

Assume we want to maintain a replicated dictionary with insert, delete and lookup operations.
On receipt of a message with a partial log and TT:
  Update the local copy of the dictionary.
  Update the local copy of TT as before.
  Garbage collect from the local log any records that correspond to events e such that ∄ site j for which hasrec(TT_i, e, j) is not true, i.e., every site is known to have learned of e.
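
The receive-side steps for the dictionary can be sketched as three helpers. This is a simplified illustration under stated assumptions: each key is inserted at most once, the timetable is a list of rows indexed by site number, new events are identified with the timetable as it was before merging, and all names (`Event`, `new_events`, `update_dictionary`, `garbage_collect`) are hypothetical.

```python
from collections import namedtuple

# Hypothetical record: originating site, local time, "insert"/"delete", key.
Event = namedtuple("Event", ["node", "time", "op", "key"])

def hasrec(tt, e, k):
    return tt[k][e.node] >= e.time

def new_events(tt, i, partial):
    """Records in the message that site i has not already seen.

    Must be called with site i's timetable from before the merge.
    """
    return {e for e in partial if not hasrec(tt, e, i)}

def update_dictionary(dictionary, events):
    """Add inserted keys, then drop every key deleted by a new event."""
    inserted = {e.key for e in events if e.op == "insert"}
    deleted = {e.key for e in events if e.op == "delete"}
    return (dictionary | inserted) - deleted

def garbage_collect(log, tt, n):
    """Keep only records that some site may still be missing."""
    return {e for e in log
            if not all(hasrec(tt, e, j) for j in range(n))}
```

Filtering through `new_events` matters: without it, a re-sent insert record could bring back a key whose delete record has already been garbage collected.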

33
Asynchronous replication

Tolerates message loss, failures and partitioning.
Maintains causality as the correctness criterion: if e → f and a site is aware of f, then it is aware of e.
Extensions for transaction semantics [SAE97].
Various proposals to expand the semantics to other applications, e.g., the Bayou project.

34
Where is the future?

Does it belong to the strict atomic approach, which ensures secure and predictable behavior?
Or does it belong to the lazy propagation approach, which is more scalable and flexible?
A hybrid approach?
