Replication: Synchronous and Asynchronous - PowerPoint PPT Presentation


Title: Replication: Synchronous and Asynchronous


1
Replication: Synchronous and Asynchronous
  • Amr El Abbadi
  • Department of Computer Science
  • University of California
  • Santa Barbara, CA 93106

2
Organization
  • The basic replication model [BHG87]
  • Serializability theory for replicated databases
  • Replica control protocols
  • quorums
  • available copies
  • view-based replication
  • Asynchronous replication
  • Wuu and Bernstein--the epidemic model.

3
Why Replicate Data?
  • Application semantics (domain servers, routing
    info, etc).
  • Fault-tolerance (banks, information, etc)
  • Performance (search engines, parallel
    applications, etc)

4
The Synchronous approach
  • Correctness: a replicated database should behave
    like a one-copy database insofar as the users
    can tell.
  • Model: each object x is implemented by a set of
    copies x1, x2, x3, ... that reside on different
    sites s1, s2, s3, ...

5
Simple Approach
  • Read-one/write-all protocol.
  • read[x] is translated to a read of any copy xa.
  • write[x] is translated to a write of all copies
    xa, xb, ...
  • Use any correct concurrency control protocol.
  • What if failures happen? No write operations can
    complete while any copy is unavailable!
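
The read-one/write-all translation above can be sketched roughly as follows. This is a minimal illustration, not the presenter's code: the names Copy, SiteDown, rowa_read, and rowa_write are hypothetical, failures are modeled by a simple `up` flag, and concurrency control is left out entirely.

```python
class SiteDown(Exception):
    """Raised when a needed copy is unreachable (hypothetical failure model)."""


class Copy:
    """One physical copy of object x, stored at one site."""

    def __init__(self, value=None):
        self.value = value
        self.up = True  # assumption: a failed site is simply marked down


def rowa_read(copies):
    """read[x]: return the value of any one reachable copy."""
    for c in copies:
        if c.up:
            return c.value
    raise SiteDown("no copy of x is reachable")


def rowa_write(copies, value):
    """write[x]: update every copy, or refuse if any copy is down."""
    if not all(c.up for c in copies):
        raise SiteDown("write-all blocked: at least one copy is down")
    for c in copies:
        c.value = value
```

With three copies, a single failed site still allows reads but blocks every write, which is the weakness the last bullet points at.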

6
Write all available copies
  • Consider the following history:
      w0[xa]                     w1[xa]
      w0[xb]   r2[xb]   Fail(b)
      w0[yc]   r1[yc]            w2[yc]
  • Since t2 reads-x-from t0, the order must be
    t0 → t2 → t1.
  • But t1 reads-y-from t0, so the order must be
    t0 → t1 → t2: a contradiction!
  • Nevertheless, the SG is acyclic.
  • [Figure: serialization graph over t0, t1, and t2]
7
Correctness of replicated objects
  • One-copy equivalence: the different copies of
    an object must appear as a single copy.
  • Serializability: the concurrent execution of a
    set of transactions must be equivalent to a
    serial execution.
  • One-copy serializability: the concurrent
    execution of a set of transactions must be
    equivalent to a serial history on single-copy
    objects.

8
One-Copy Serialization Graph
  • Given a history H, a 1-SG(H) is SG(H) with
    enough edges added such that:
  • For all objects x, 1-SG(H) embodies a total order
    (<<) on all transactions that write x.
  • If tj reads-x-from ti, and ti << tk, then
    1-SG(H) contains a path from tj to tk.
  • [Figure: ti, tj, and tk, with a path from tj to tk]

9
Back to example
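  • Recall:
      w0[xa]                     w1[xa]
      w0[xb]   r2[xb]   Fail(b)
      w0[yc]   r1[yc]            w2[yc]
  • [Figure: the SG over t0, t1, and t2]
  • Since t1 reads-y-from t0, and t0 << t2,
    then t1 → t2.
  • But t2 reads-x-from t0, and t0 << t1,
    so t2 → t1.
  • [Figure: the 1-SG over t0, t1, and t2, in which
    t1 → t2 and t2 → t1 form a cycle]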
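
The argument on this slide can be checked mechanically. The sketch below is only an illustration under stated assumptions: the history is linearized in one order consistent with the three rows, the write orders (t0 << t1 on x, t0 << t2 on y) and reads-from pairs are entered by hand, and the helper names sg_edges, one_sg_edges, and has_cycle are hypothetical.

```python
def has_cycle(nodes, edges):
    """Depth-first search for a cycle in a directed graph."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(n):
        state[n] = 1
        for m in adj[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)


def sg_edges(history):
    """Edge ti -> tj whenever ti's operation conflicts with a later one of tj on the same copy."""
    edges = set()
    for i, (op1, t1, copy1) in enumerate(history):
        for op2, t2, copy2 in history[i + 1:]:
            if copy1 == copy2 and t1 != t2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def one_sg_edges(sg, reads_from, write_order):
    """Add the write-order edges and the reader -> later-writer edges to SG."""
    edges = set(sg)
    for order in write_order.values():
        edges.update(zip(order, order[1:]))
    for reader, obj, writer in reads_from:
        order = write_order[obj]
        for later in order[order.index(writer) + 1:]:
            if later != reader:
                edges.add((reader, later))
    return edges


# One linearization of the example history: (operation, transaction, copy).
history = [("w", 0, "xa"), ("w", 0, "xb"), ("w", 0, "yc"),
           ("r", 2, "xb"), ("r", 1, "yc"), ("w", 1, "xa"), ("w", 2, "yc")]
txns = [0, 1, 2]
sg = sg_edges(history)
one_sg = one_sg_edges(sg,
                      reads_from=[(2, "x", 0), (1, "y", 0)],
                      write_order={"x": [0, 1], "y": [0, 2]})
```

Under these assumptions, has_cycle(txns, sg) is false while has_cycle(txns, one_sg) is true, matching the slide: the plain SG does not expose the problem, but the 1-SG does.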

10
Available Copies Protocol [BG 83]
  • Recall:
      w0[xa]                     w1[xa]
      w0[xb]   r2[xb]   Fail(b)
      w0[yc]   r1[yc]            w2[yc]
  • Introduce the failure of a site as an atomic
    transaction OUTb (similarly, INb for recovery),
    which causes transactions to change their write
    set (change directory info).
  • We explicitly force a path through OUTb.
  • [Figure: graph over t0, t2, OUTb, and t1]

11
Available copies protocol
  • Inexpensive read operations
  • Tolerates site failures
  • - Does NOT tolerate partitioning failures!
  • [Figure: a network split into two partitions, P1 and P2]

12
Quorum Consensus Protocol [Gifford 79]
  • Extend the idea of quorums for mutual exclusion
    to read and write operations, i.e., read and
    write quorums.
  • [Figure: two pairs of intersecting quorums, one
    labeled read quorum / write quorum and one
    labeled write quorum / write quorum]

13
Quorum Consensus Protocol
  • Associate with each copy a version number.
  • Write operation:
  • Determine the max-version-no in a write quorum.
  • Update the write quorum with the new value and
    set its version numbers to max-version-no + 1.
  • Read operation:
  • Read the value of the copy with the
    max-version-no in a read quorum.
  • Use a correct concurrency control protocol.
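
A rough sketch of the read and write steps above, assuming quorums are sized by count so that a read quorum and a write quorum always overlap (r + w > n) and two write quorums always overlap (2w > n). The names Replica, pick_quorum, quorum_write, and quorum_read are hypothetical, quorums are picked at random for illustration, and concurrency control is omitted.

```python
import random


class Replica:
    """One copy with its value and version number."""

    def __init__(self):
        self.value = None
        self.version = 0


def pick_quorum(replicas, size):
    """Hypothetical helper: choose any `size` replicas as the quorum."""
    return random.sample(replicas, size)


def quorum_write(replicas, w, value):
    """Write: find the max version in a write quorum, then install max + 1."""
    quorum = pick_quorum(replicas, w)
    new_version = max(rep.version for rep in quorum) + 1
    for rep in quorum:
        rep.value, rep.version = value, new_version
    return new_version


def quorum_read(replicas, r):
    """Read: return the value of the highest-versioned copy in a read quorum."""
    quorum = pick_quorum(replicas, r)
    newest = max(quorum, key=lambda rep: rep.version)
    return newest.value
```

With n = 5 and r = w = 3, every read quorum contains at least one copy from the latest write quorum, so quorum_read returns the most recently written value no matter which replicas are picked.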

14
Correctness
  • The SG(h) for any execution created by the quorum
    consensus protocol is:
  • Acyclic: guaranteed by the correct concurrency
    control protocol.
  • A 1-SG(h): all conflicting operations conflict on
    a copy, so
  • (1) SG(h) has a total order on all write
    operations,
  • (2) SG(h) orders all read and write conflicts.

15
Quorum Consensus Protocol
  • No special treatment for failures and
    recovery.
  • Tolerates both site and partitioning
    failures
  • - Expensive read operations.
  • - Large number of copies to tolerate a given
    number of failures, e.g., 3 copies to tolerate 1
    failure, 5 copies to tolerate 2 failures, etc.
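
The 3-for-1 and 5-for-2 figures follow from a simple count if one assumes majority quorums: a majority must survive, so tolerating f failures takes 2f + 1 copies. The function names below are hypothetical and the majority assumption is mine, not stated on the slide.

```python
def copies_needed(failures):
    """Copies required so that a majority survives `failures` site failures."""
    return 2 * failures + 1


def failures_tolerated(copies):
    """Largest number of failed copies that still leaves a majority."""
    return (copies - 1) // 2
```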

16
Virtual Partitions Protocol [El Abbadi et al. 85, 86]
  • Quorums can tolerate partitions
  • Available copies allows read-one.
  • We want to combine the best of both worlds!
  • Use quorums to decide when to execute an
    operation
  • Use read-one write-all-available for actual
    execution.

17
Views
  • We associate with each site s, view(s), which is
    the set of sites s assumes it can communicate
    with.
  • Ideally:
  • [Figure: sites a, b, and c annotated with their
    views; the labels shown are {a,c} and {b}]

18
Virtual Partitions Rule
  • Accessibility Rule: a transaction executes only
    if a majority of sites are in its view.
  • Read/write Rule: read one copy, write all copies
    in view.
  • [Figure: sites a, b, and c annotated with views;
    the labels shown are {b}, {b,c}, {a,b}, and {a,b,c}]

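
The two rules on this slide can be sketched as below. This is an illustration only: it assumes three sites a, b, and c with one copy each, represents a view as a set of site names, and uses hypothetical names (ALL_SITES, accessible, vp_write, vp_read). View changes and concurrency control are not modeled.

```python
ALL_SITES = frozenset({"a", "b", "c"})  # assumption: three sites, one copy each


def accessible(view, all_sites=ALL_SITES):
    """Accessibility rule: a majority of all sites must be in the view."""
    return 2 * len(view & all_sites) > len(all_sites)


def vp_write(store, view, value):
    """Write rule: write all copies in the view, if the view has a majority."""
    if not accessible(view):
        raise RuntimeError("view lacks a majority; transaction may not execute")
    for site in view:
        store[site] = value


def vp_read(store, view, site):
    """Read rule: read one copy in the view, if the view has a majority."""
    if not accessible(view) or site not in view:
        raise RuntimeError("copy not readable from this view")
    return store[site]
```

A site whose view is {a, b} may read locally and write both copies, while a site whose view is only {c} is refused, so at most one side of a partition makes progress.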
19
Virtual Partitions Protocol
  • Communication Rule: only sites with the same
    view are allowed to communicate.
  • Each new view has a view-id associated with it.
  • View changes:
  • The initiating site s decides on the members of
    the new view, and picks a view-id greater than
    any previous one.
  • s then executes an update transaction to update
    all copies in the view with the most up-to-date
    value for each object.
  • The update transaction accesses all copies of an
    object with a majority of sites in the new view.
  • A site participates in a new update transaction
    only if its local view-id is less than the
    proposed view-id.

20
Correctness idea
  • Global correctness
  • majority rule
  • Local correctness
  • read-one write all
  • correct concurrency control protocol

21
Virtual partitions Protocol
  • Tolerates partitions and site failures
  • Allows read one rule.
  • - Costly update transaction

22
Asynchronous or Lazy replication
  • In large Internet-type settings,
    transaction-based replication is:
  • too expensive (remember 2PC),
  • unrealistic (not all sites are up all the time),
  • not scalable (large number of sites).
  • Epidemic approach (e.g., the Bayou project at Xerox):
  • information is changed locally, and then
    propagated in a lazy manner to all other
    replicas.
  • Correctness is based on causality.

23
Replicated dictionary problem
  • "Efficient solutions to the replicated log and
    dictionary problems" [Wuu and Bernstein, PODC 84].
  • Basic assumptions:
  • Sites may crash, links may fail, and the network
    may partition.
  • Each site maintains a local clock (a counter).
  • Local events are atomic.
  • Use Lamport's event execution model and
    happens-before relation.

24
The log problem
  • Each site maintains a copy of the log.
  • The log contains local events, i.e.,
  • insert
  • delete
  • The goal of the algorithm is to keep all copies
    of the log up to date.
  • L_i is the copy of the log at site i.
  • L(e) is the contents of log L_node(e) immediately
    after event e is executed.

25
The log problem
  • Log Problem: find an algorithm that maintains
    the log such that, given an execution <E, →>,
  • for all events e, f: if f → e then f is in L(e).
  • General approach:
  • For each local event, insert a record in the
    local log.
  • Exchange logs to update other sites.
  • Main question: when to exchange logs? With
    application communication, to capture the
    happens-before relation.

26
Solutions to the log problem
  • A solution:
  • Site i sends to site j all records in the log
    that were inserted since i last sent a message to
    j.
  • WHY INCORRECT?
  • Another solution:
  • Each site i includes L_i with each message.
  • On receiving a message, a site j incorporates all
    new event records.
  • BAD:
  • The entire log is sent with each message.
  • The entire log is kept at each node.

27
Efficient solution for log problem
  • Observation 1: once i knows that j knows of
    an event e (which may have occurred on site k),
    i does not need to include event e in
    messages sent to j.
  • Observation 2: once i knows that all sites
    know about an event e, i does not need to
    keep a record of e in its local log.

28
2 Dimensional Time-Table
  • Site i maintains a table TT_i[n, n].
  • If TT_i[j, k] = t, then site i knows that site j
    has learned of all events that occurred at site k
    up to time t.
  • [Figure: time-table with row j and column k
    holding the entry t]

29
The 2 dimensional timetable
  • Notes
  • site j might actually know about more events, but
    site i may not be aware of it.
  • TT_i[i, i] is the value of the clock at site i.
  • TT_i[i, k] is the value of the clock at site k for
    the most recent event at site k that site i is
    aware of.

30
Two dimensional timetable
  • Let hasrec(TT_i, e, k) be true iff
  • TT_i[k, node(e)] >= time(e)
  • The algorithm must guarantee that if
    hasrec(TT_i, e, k) is true, then site k has
    learned of event e.
  • Note: site i need not send a record of event e
    to site k if hasrec(TT_i, e, k) is true.

31
Log maintenance
  • Initialize all entries in TT to 0.
  • For each local operation, insert a copy in the
    local log.
  • With each send operation from site i to site k,
    piggyback TT_i and the following subset of the
    local log L_i: all records e such that
    hasrec(TT_i, e, k) is not true.
  • On receipt of a message from site k by site i:
  • incorporate all new events into the local log
  • update TT_i:
  • Row i: take the max of the times in the local
    ith row and the remote kth row.
  • All entries: take the element-wise max of the
    local and received tables.
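
The log maintenance steps above can be sketched as follows. This is a minimal illustration under assumptions, not the published algorithm verbatim: sites are numbered 0..n-1, messages are plain tuples passed by hand, the log is a set of records, and the names Event, Site, local_event, send, and receive are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    node: int   # site where the event occurred
    time: int   # that site's clock value at the event
    op: str     # e.g. "insert(x)"


class Site:
    def __init__(self, site_id, n):
        self.id = site_id
        self.n = n
        self.tt = [[0] * n for _ in range(n)]  # all entries start at 0
        self.log = set()

    def hasrec(self, e, k):
        """True iff this site knows that site k has learned of event e."""
        return self.tt[k][e.node] >= e.time

    def local_event(self, op):
        """Advance the local clock and insert a record in the local log."""
        self.tt[self.id][self.id] += 1
        e = Event(self.id, self.tt[self.id][self.id], op)
        self.log.add(e)
        return e

    def send(self, k):
        """Message to site k: records k may lack, a copy of TT, and the sender id."""
        records = {e for e in self.log if not self.hasrec(e, k)}
        return records, [row[:] for row in self.tt], self.id

    def receive(self, message):
        records, table, k = message
        self.log |= records  # incorporate new events (a set drops duplicates)
        for col in range(self.n):  # own row: max with the sender's row
            self.tt[self.id][col] = max(self.tt[self.id][col], table[k][col])
        for row in range(self.n):  # every entry: element-wise max
            for col in range(self.n):
                self.tt[row][col] = max(self.tt[row][col], table[row][col])
```

For example, after site 0 records an event and sends to site 1, a later message from site 1 back to site 0 omits that event, because site 1's table already shows that site 0 has it.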

32
Dictionary problem
  • Assume we want to maintain a replicated
    dictionary with insert, delete and lookup
    operations.
  • On receipt of a message with a partial log and
    TT:
  • Update the local copy of the dictionary.
  • Update the local copy of TT as before.
  • Garbage collect from the local log any record of
    an event e for which there is no site j such
    that hasrec(TT_i, e, j) is not true, i.e., every
    site is known to have learned of e.
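
A sketch of the receive step for the dictionary, including the garbage collection rule. It rests on assumptions not spelled out on the slide: each key is inserted at most once and deleted only after its insert, already-known records are filtered out before the dictionary is updated, and the names Event, hasrec, update_dictionary, garbage_collect, and receive are hypothetical.

```python
from collections import namedtuple

Event = namedtuple("Event", "node time op key")  # op is "insert" or "delete"


def hasrec(tt, e, k):
    """True iff table tt says site k has learned of event e."""
    return tt[k][e.node] >= e.time


def update_dictionary(dictionary, new_events):
    """Apply new inserts and deletes (assumes each key is inserted at most once)."""
    inserted = {e.key for e in new_events if e.op == "insert"}
    deleted = {e.key for e in new_events if e.op == "delete"}
    return (dictionary | inserted) - deleted


def garbage_collect(log, tt):
    """Keep only records that some site is not yet known to have."""
    n = len(tt)
    return {e for e in log if any(not hasrec(tt, e, j) for j in range(n))}


def receive(i, dictionary, log, tt, records, remote_tt, k):
    """Site i processes a partial log and time-table received from site k."""
    n = len(tt)
    new = {e for e in records if not hasrec(tt, e, i)}  # not already known here
    dictionary = update_dictionary(dictionary, new)
    for col in range(n):  # own row: max with the sender's row
        tt[i][col] = max(tt[i][col], remote_tt[k][col])
    for row in range(n):  # every entry: element-wise max
        for col in range(n):
            tt[row][col] = max(tt[row][col], remote_tt[row][col])
    return dictionary, garbage_collect(log | new, tt)
```

With only two sites, a record can be dropped as soon as the second site receives it; with three sites it stays in the log until the table shows that the third site has it too.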

33
Asynchronous replication
  • Tolerates message loss, failures and
    partitioning.
  • Maintains causality as the correctness
    criterion:
  • if e → f and a site is aware of f, then it is
    aware of e.
  • Extensions for transaction semantics [SAE97].
  • Various proposals to expand the semantics to
    other applications, e.g., the Bayou project.

34
Where is the future?
  • Does it belong to the strict atomic approach,
    which ensures secure and predictable behavior?
  • Or does it belong to the lazy propagation
    approach, which is more scalable and flexible?
  • A hybrid approach?