Consistency and Replication

1
Consistency and Replication
By Deepa Jandhyala and Deepak Chinavle
2
Introduction
  • In distributed systems, data is replicated to
    improve performance and enhance reliability.
  • Replication leads to consistency problems between
    copies.
  • How do we achieve consistency of replicated data
    while multiple processes are accessing the data?
  • We will look at some consistency models followed
    by some replica management techniques.

3
Replication
  • Reasons for Replication
  • 1) Increase Reliability
  • Continue working after one replica crashes.
  • Multiple copies provide better protection
    against corrupted data.
  • Safeguard against single failing write operation
    by considering the value that is returned by at
    least two copies as being the correct one.
  • 2) Improve Performance
  • Scaling in numbers.
  • When too many processes are accessing one server,
    performance can be improved by replicating the
    server and dividing the work.
  • Scaling with respect to size of geographical
    area.
  • Placing a copy of data in the proximity of the
    process using it decreases access latency.
  • Price of Replication: Consistency problems

4
Consistency Issues
  • Tight Consistency - all copies of replicated data
    need to be consistent at all times
  • Updates performed as single atomic operation.
  • Leads to scalability problems across large
    networks
  • Data needs to be synchronized.
  • Each copy needs to reach agreement on when to
    perform update locally.
  • Global Synchronization needed to keep all
    replicas consistent
  • Leads to high performance costs.
  • Solution: Loosen consistency constraints
  • Avoid global synchronization and gain performance.

5
Consistency Models
  • A contract between processes and the distributed
    data store (collection of shared data accessible
    to clients) concerning read and write operations
    to the data.
  • If processes obey certain rules then data store
    will work correctly.
  • A process that performs a read operation on a
    data item expects to see the last write operation
    on that data.
  • Each model effectively restricts the values that
    a read operation on a data item can return
  • Models with major restrictions are easier to use
    but don't perform as well as models with minor
    restrictions.

6
Types of Consistency Models
  • Data-Centric Consistency Models
  • Systemwide consistent view on a data store where
    concurrent processes can simultaneously update
    the data store.
  • Continuous
  • Sequential
  • Causal
  • Entry
  • The general organization of a logical data store,
    physically distributed and replicated across
    multiple processes.

7
Strict Consistency
  • Strongest consistency model
  • Any read on a data item X returns a value
    corresponding to the result of the most recent
    write on X
  • Need an absolute global time
  • "most recent" needs to be unambiguous
  • this behavior can be observed in uniprocessors
  • a = 7; a = 13; print(a) has to print 13 as
    output
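  • Suppose 2 processors, A and B, are a few meters apart
  • B has a copy of X. A sends a request to read X at
    T1, and B writes X at T2. Strict consistency requires
    the read to return the value from before the write.
    If T2 - T1 is less than the time it takes to propagate
    the request, the request reaches B only after the
    write, so due to the laws of physics B cannot serve
    the read before the write takes effect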
  • Clearly, strict consistency is hard!
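
To put numbers on this argument, here is a minimal sketch. The 3 m distance and 1 ns gap are illustrative assumptions, not values from the slide, and the function names are hypothetical.

```python
# Assumed, illustrative numbers: machines 3 m apart, write 1 ns after the read.
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def min_propagation_time_s(distance_m: float) -> float:
    """Lower bound on the time a request needs to travel distance_m."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S


def read_can_beat_write(distance_m: float, gap_s: float) -> bool:
    """True if a read sent at T1 can physically arrive before a write at T1 + gap_s."""
    return min_propagation_time_s(distance_m) < gap_s


distance_m = 3.0
gap_s = 1e-9
required_speed = distance_m / gap_s  # speed the request would need, in m/s

# The request would have to travel at roughly 10 times the speed of light.
print(required_speed / SPEED_OF_LIGHT_M_PER_S)
print(read_can_beat_write(distance_m, gap_s))  # False
```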

8
Continuous Consistency
  • Can be measured along three dimensions, based on
    how much inconsistency the applications can
    tolerate
  • Deviation in numerical values
  • Deviation in staleness
  • Deviation with respect to the ordering of update
    operations
  • To define inconsistencies we define a conit: a conit
    specifies the unit over which consistency is to
    be measured.
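
The three dimensions can be sketched as bounds attached to a conit. This is a simplified illustration, not a full protocol: the class and field names are hypothetical, and numerical and order deviation are measured here only over local writes that have not been propagated yet.

```python
from dataclasses import dataclass, field


@dataclass
class ConitBounds:
    max_numerical: float    # tolerated total weight of unpropagated writes
    max_order: int          # tolerated number of tentative (unordered) writes
    max_staleness_s: float  # tolerated time since the last synchronization


@dataclass
class ConitReplica:
    bounds: ConitBounds
    value: float = 0.0
    tentative: list = field(default_factory=list)  # local writes not yet propagated
    last_sync: float = 0.0

    def write(self, delta: float) -> None:
        """Apply a local update and remember it as tentative."""
        self.value += delta
        self.tentative.append(delta)

    def deviations(self, now: float) -> dict:
        """Current deviation along each of the three dimensions."""
        return {
            "numerical": sum(abs(d) for d in self.tentative),
            "order": len(self.tentative),
            "staleness": now - self.last_sync,
        }

    def must_sync(self, now: float) -> bool:
        """True once any deviation exceeds what the application tolerates."""
        d = self.deviations(now)
        return (d["numerical"] > self.bounds.max_numerical
                or d["order"] > self.bounds.max_order
                or d["staleness"] > self.bounds.max_staleness_s)

    def sync(self, now: float) -> list:
        """Hand the tentative writes over for propagation and reset the deviations."""
        pushed, self.tentative = self.tentative, []
        self.last_sync = now
        return pushed


replica = ConitReplica(ConitBounds(max_numerical=5, max_order=2, max_staleness_s=10))
replica.write(2)
replica.write(2)
print(replica.must_sync(now=2))  # False: numerical 4, order 2, staleness 2
replica.write(2)
print(replica.must_sync(now=3))  # True: numerical 6 and order 3 exceed the bounds
```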
9
Continuous Consistency - Example of a Conit
  • Keeping track of consistency deviations
10
Choosing the appropriate granularity for a conit.
(a) Two updates lead to update propagation.
(b) No update propagation is needed (yet).
11
Linearizability and Sequential Consistency
  • Strict consistency is the ideal model
  • but impossible to implement!
  • Often such strict consistency is not
    needed
  • Sequential consistency
  • Lamport (1979)
  • slightly weaker than strict consistency
  • defined by Lamport for shared memory for
    multi-processors
  • Definition: The result of any execution is the
    same as if the (read and write) operations by all
    processes on the data store were executed in some
    sequential order and the operations of each
    individual process appear in this sequence in the
    order specified by its program
  • The definition means that when processes are running
    concurrently
  • any interleaving of read and write operations is
    acceptable, but all processes see the same
    interleaving of operations
  • Difference from strict consistency
  • no reference to the most recent time
  • absolute global time does not play a role

12
Sequential Consistency
  • A sequentially consistent data store. (P3 and P4
    see the same order)
  • A data store that is not sequentially consistent.
    (P3 and P4 don't see the same order of events)
  • Note, it doesn't matter when the events actually
    took place
  • It does matter if all processes see them in the
    same order

13
Linearizability and Sequential Consistency
  • Three concurrently executing processes.
  • Three variables are stored in a shared sequentially
    consistent data store
  • Each variable is initialized to 0
  • Assignment corresponds to a write operation
  • Various interleaved execution sequences are
    possible
  • How many?
  • Are all of them sequentially valid?

14
Linearizability and Sequential Consistency
  • Four valid execution sequences for the processes
    of the previous slide. The vertical axis is time.
  • Signature: output from P1, P2 and P3 as a string
  • Not all 64 (2^6) patterns are allowed
  • 000000 is not possible (print statements ran before
    assignments!)
  • 001001 is also not possible (why?)
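
The questions on the previous two slides can be checked by brute force. The sketch below assumes the three programs are the ones from the textbook example this slide appears to follow (P1: x = 1; print(y, z), P2: y = 1; print(x, z), P3: z = 1; print(x, y)). The slide text itself does not list them, so treat them as an assumption.

```python
# Assumed programs: each process writes one variable, then prints the other two.
PROGRAMS = {
    "P1": [("write", "x"), ("print", ("y", "z"))],
    "P2": [("write", "y"), ("print", ("x", "z"))],
    "P3": [("write", "z"), ("print", ("x", "y"))],
}


def interleavings(programs, position=None):
    """Yield every schedule that keeps each process's statements in program order."""
    if position is None:
        position = {name: 0 for name in programs}
    if all(position[name] == len(stmts) for name, stmts in programs.items()):
        yield []
        return
    for name, stmts in programs.items():
        if position[name] < len(stmts):
            advanced = dict(position)
            advanced[name] += 1
            for rest in interleavings(programs, advanced):
                yield [(name, stmts[position[name]])] + rest


def signature(schedule):
    """Run one schedule against a single store; concatenate the output of P1, P2, P3."""
    store = {"x": 0, "y": 0, "z": 0}
    output = {}
    for process, (op, arg) in schedule:
        if op == "write":
            store[arg] = 1
        else:
            output[process] = "".join(str(store[v]) for v in arg)
    return output["P1"] + output["P2"] + output["P3"]


schedules = list(interleavings(PROGRAMS))
signatures = {signature(s) for s in schedules}

print(len(schedules))            # 90 = 6! / (2! * 2! * 2!)
print("000000" in signatures)    # False
print("001001" in signatures)    # False
```

Under these assumptions there are 90 program-order-preserving interleavings, and neither 000000 nor 001001 is produced by any of them.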

15
Causal Consistency
  • Necessary condition: Writes that are potentially
    causally related must be seen by all processes in
    the same order. Concurrent writes may be seen in
    a different order on different machines.
  • Weaker than sequential consistency
  • If event B is caused or influenced by an earlier
    event A, causality requires that everyone first
    see A and then B
  • Concurrent: operations that are not causally
    related
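
One common way to decide whether two writes are causally related or concurrent is to compare vector timestamps. Vector clocks are not mentioned on the slide; they are brought in here only as an illustration, and the timestamps below are made up.

```python
def happened_before(a, b):
    """True if vector timestamp a causally precedes vector timestamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither timestamp causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


# Hypothetical timestamps for two processes (P1, P2):
w_a = (1, 0)  # P1 writes a
w_b = (1, 1)  # P2 reads a, then writes b: causally after w_a
w_c = (2, 0)  # P1 writes c without having seen b

print(happened_before(w_a, w_b))  # True: everyone must see a before b
print(concurrent(w_b, w_c))       # True: b and c may be seen in either order
```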

16
Causal Consistency (1)
  • This sequence is allowed with a
    causally-consistent store, but not with a
    sequentially or strictly consistent store.
  • W(x)b and W(x)c are concurrent
  • so all processes don't see them in the same
    order
  • P3 and P4 read the values a and b in order as
    they are potentially causally related. No
    causality for the value c
  • This is not sequentially consistent though
  • as P3 and P4 see the values in different order

17
Causal Consistency (2)
  • A violation of a causally-consistent store (W(x)b
    is potentially dependent on W(x)a, so the two are
    causally related)
  • A correct sequence of events in a
    causally-consistent store (as P2 does not read
    the value of a before its write)

18
Entry Consistency
Conditions
  • An acquire access of a synchronization variable
    is not allowed to perform with respect to a
    process until all updates to the guarded shared
    data have been performed with respect to that
    process.
  • Before an exclusive mode access to a
    synchronization variable by a process is allowed
    to perform with respect to that process, no other
    process may hold the synchronization variable,
    not even in nonexclusive mode.
  • After an exclusive mode access to a
    synchronization variable has been performed, any
    other process's next nonexclusive mode access to
    that synchronization variable may not be
    performed until it has performed with respect to
    that variable's owner.
19
Types of Consistency Models
  • Client-Centric Consistency Models
  • Consistency for a single client with no
    guarantees concerning concurrent accesses by
    different clients
  • Monotonic-Reads
  • Monotonic-Writes
  • Read-Your-Writes
  • Writes-Follow-Reads
  • Examples
  • DNS
  • Single naming authority per zone
  • lazy propagation of updates
  • WWW
  • No write-write conflicts
  • Usually acceptable to serve slightly out-of-date
    pages from a cache

20
Eventual Consistency
  • The principle of a mobile user accessing
    different replicas of a distributed database.

If no updates take place for some time, all
replicas gradually converge to a consistent
state
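
A minimal sketch of this convergence, assuming replicas reconcile pairwise and resolve conflicts with a last-writer-wins rule on timestamps. Both the pairwise exchange and the timestamp rule are assumptions made for illustration; the slide does not prescribe a mechanism.

```python
def merge(a, b):
    """Last-writer-wins merge: per key, keep the entry with the larger timestamp."""
    merged = dict(a)
    for key, (timestamp, value) in b.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


def exchange_pass(replicas):
    """Let every pair of replicas reconcile once."""
    for i in range(len(replicas)):
        for j in range(i + 1, len(replicas)):
            merged = merge(replicas[i], replicas[j])
            replicas[i] = dict(merged)
            replicas[j] = dict(merged)


# Each replica maps a key to (timestamp, value); the updates have stopped.
replicas = [{"x": (1, "a")}, {"x": (2, "b")}, {"y": (1, "c")}]

passes = 0
while any(r != replicas[0] for r in replicas):
    exchange_pass(replicas)
    passes += 1

print(replicas[0])  # {'x': (2, 'b'), 'y': (1, 'c')}
```

Once no new updates arrive, repeated exchanges leave every replica with the same contents.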
21
Notations for client-centric models
  • x_i[t]: version of object x at local copy L_i at
    time t
  • x_i[t] is the result of a series of writes at L_i
    since system initialization
  • WS(x_i[t]): that series of writes
  • WS(x_i[t1]; x_j[t2]): the writes in WS(x_i[t1]) have
    also been performed at copy L_j at a later time t2
  • Assume an owner for each data item
  • avoid write-write conflicts
  • Monotonic reads
  • Monotonic writes
  • Read-your-writes
  • Writes-follow-reads

22
Monotonic Reads
If a process has seen a value of x at time t, it
will never see an older value at a later time.
WS(x1) is part of WS(x2)
  • Example
  • replicated mailboxes with on-demand propagation
    of updates
  • The read operations performed by a single process
    P at two different local copies of the same data
    store.
  • (a) A monotonic-read consistent data store
  • (b) A data store that does not provide monotonic
    reads
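
The condition "WS(x1) is part of WS(x2)" can be sketched as a subset check on write sets. The class below is a hypothetical client-side session, and the message identifiers are invented for the mailbox example.

```python
class ReadSession:
    """Tracks the writes a single process has already observed."""

    def __init__(self):
        self.seen_writes = frozenset()

    def read(self, copy_writes):
        """Read from a local copy, refusing copies that lack writes already seen."""
        copy_writes = frozenset(copy_writes)
        if not self.seen_writes <= copy_writes:
            raise LookupError("copy is missing writes this process has already seen")
        self.seen_writes = copy_writes
        return copy_writes


mailbox_l1 = {"msg1", "msg2"}
mailbox_l2 = {"msg1"}            # msg2 has not been propagated to L2 yet

session = ReadSession()
session.read(mailbox_l1)
try:
    session.read(mailbox_l2)     # would show an older mailbox: case (b)
except LookupError as err:
    print("rejected:", err)

mailbox_l2.add("msg2")           # on-demand propagation brings L2 up to date
print(sorted(session.read(mailbox_l2)))  # ['msg1', 'msg2']
```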

23
Monotonic Writes
If an update is made to a copy, all preceding
updates must have been completed first.
A write may affect only part of the state of a
data item
FIFO propagation of updates by each process
Example: updating a software library
No guarantee that x at L2 has the same value as
x at L1 at the time W(x1) completed
  • The write operations performed by a single
    process P at two different local copies of the
    same data store
  • A monotonic-write consistent data store.
  • A data store that does not provide
    monotonic-write consistency.
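
FIFO propagation per process can be sketched with sequence numbers: a copy buffers a write until every earlier write from the same process has been applied. The class name, the sequence numbering, and the update strings are assumptions for illustration.

```python
class Copy:
    """A local copy that applies each process's writes in the order they were issued."""

    def __init__(self):
        self.log = []        # writes applied so far, in order
        self.next_seq = {}   # per process: next sequence number expected
        self.pending = {}    # writes that arrived too early

    def deliver(self, process, seq, update):
        """Accept a write; apply it only once all its predecessors are applied."""
        self.pending[(process, seq)] = update
        expected = self.next_seq.get(process, 0)
        while (process, expected) in self.pending:
            self.log.append(self.pending.pop((process, expected)))
            expected += 1
        self.next_seq[process] = expected


l2 = Copy()
l2.deliver("P", 1, "patch library to v2")   # arrives first, but must wait
print(l2.log)                               # []
l2.deliver("P", 0, "install library v1")
print(l2.log)                               # ['install library v1', 'patch library to v2']
```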

24
Read Your Writes
A write is completed before a successive read, no
matter where the read takes place
  • Negative examples
  • updates of Web pages
  • changes of passwords

The effects of the previous write at L1 have not
yet been propagated!
  1. A data store that provides read-your-writes
    consistency.
  2. A data store that does not.
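
A minimal sketch of the check behind read-your-writes, using the password example. It assumes a single owner per data item, so a version counter is enough to order writes; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LocalCopy:
    version: int = 0
    value: str = ""


class Client:
    """Remembers the version produced by its own latest write."""

    def __init__(self):
        self.last_written = 0

    def write(self, copy, value):
        copy.version += 1
        copy.value = value
        self.last_written = copy.version

    def read(self, copy):
        """Return the value, or None if this copy has not yet seen the client's write."""
        if copy.version < self.last_written:
            return None  # the client has to wait or be redirected
        return copy.value


def propagate(source, target):
    """Bring target up to date with source."""
    if source.version > target.version:
        target.version, target.value = source.version, source.value


l1, l2 = LocalCopy(), LocalCopy()
client = Client()
client.write(l1, "new-password")
print(client.read(l2))   # None: the write at L1 has not reached L2
propagate(l1, l2)
print(client.read(l2))   # new-password
```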

25
Writes Follow Reads
Any successive write will be performed on a copy
that is up-to-date with the value most recently
read by the process.
  • Example
  • updates of a newsgroup
  • Responses are visible only after the original
    posting has been received
  1. A writes-follow-reads consistent data store
  2. A data store that does not provide
    writes-follow-reads consistency
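
The newsgroup example can be sketched by letting each write carry the identifiers of the articles its author had read, and deferring the write at a copy until those articles are present. The class, method, and article names are hypothetical.

```python
class NewsCopy:
    """A local copy that shows a response only after the posting it depends on."""

    def __init__(self):
        self.articles = {}   # article id -> text, in the order they became visible
        self.deferred = []   # writes whose dependencies are still missing

    def post(self, article_id, text, depends_on=()):
        """Accept a write together with the articles its author had read."""
        self.deferred.append((article_id, text, frozenset(depends_on)))
        self._apply_ready()

    def _apply_ready(self):
        """Apply every deferred write whose dependencies are now visible."""
        progress = True
        while progress:
            progress = False
            for entry in list(self.deferred):
                article_id, text, deps = entry
                if deps.issubset(self.articles):
                    self.articles[article_id] = text
                    self.deferred.remove(entry)
                    progress = True


copy = NewsCopy()
copy.post("re:42", "I agree", depends_on={"42"})   # response arrives first
print(list(copy.articles))                          # []
copy.post("42", "Original posting")
print(list(copy.articles))                          # ['42', 're:42']
```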

26
Replica Placement (I)
  • The logical organization of different kinds of
    copies of a data store into three concentric
    rings.

27
Replica Placement (II)
  • Permanent copies
  • Basis of distributed data store
  • Example from the Web
  • Anycasting round-robin clusters
  • Mirror sites
  • Server-initiated
  • Push caches
  • Dynamic replication to handle bursts
  • Read-only
  • Content Distribution Network (CDN)
  • Client-initiated
  • Improve access time to data
  • Danger of stale data
  • Private vs Shared caches

28
Server-Initiated Replicas
  • Counting access requests from different clients.

CntQ(P, F)
P: closest server for both C1 and C2
  • At each server
  • Count of accesses
  • for each file
  • Originating clients

Routing DB to determine closest server for
client C
  • Deletion threshold del(S, F)
  • Replication threshold rep(S, F)

Dynamic decisions to delete/migrate/replicate
file F to server S
Extra care to ensure that at least one copy
remains!
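
The two thresholds can be sketched as a small decision function. It assumes the usual reading of this scheme: below del(S, F) the copy may be deleted, above rep(S, F) the file is worth replicating, and in between it may only be migrated. The function name and the returned labels are hypothetical.

```python
def placement_decision(access_count, del_threshold, rep_threshold, is_last_copy):
    """Decide what server S may do with its copy of file F."""
    if rep_threshold <= del_threshold:
        raise ValueError("replication threshold must exceed deletion threshold")
    if access_count > rep_threshold:
        return "replicate"
    if access_count < del_threshold:
        # Extra care: never delete the last remaining copy.
        return "keep" if is_last_copy else "delete"
    return "migrate-allowed"


print(placement_decision(500, del_threshold=10, rep_threshold=100, is_last_copy=False))
print(placement_decision(3, del_threshold=10, rep_threshold=100, is_last_copy=False))
print(placement_decision(3, del_threshold=10, rep_threshold=100, is_last_copy=True))
print(placement_decision(50, del_threshold=10, rep_threshold=100, is_last_copy=False))
```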
29
Update propagation
  • State vs Operations
  • Notification of an update
  • Invalidation protocols
  • Best for low read/write ratio
  • Transfer data from one copy to another
  • Transfer of actual data or log of changes
  • Batching
  • Best for relatively high read/write ratio
  • Propagate the update to other copies
  • Active replication
  • Pull vs Push
  • Push: replicas maintain a high degree of
    consistency
  • Updates are expected to be of use to multiple
    readers
  • Pull: best for low read/write ratio
  • Hybrid scheme based on lease model
  • Unicast vs Multicast
  • Push: multicast group
  • Pull: single server or client requests an update
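
Why the read/write ratio matters can be seen in a toy cost model for a single replica. It is only a sketch under stated assumptions: costs are counted in bytes sent, an invalidation is one byte, and after an invalidation at most one fetch is needed until the next write.

```python
def invalidation_cost(reads, writes, data_size, invalidation_size=1):
    """Bytes sent when each write only invalidates and reads fetch on demand."""
    fetches = min(reads, writes)  # at most one fetch per invalidated version
    return writes * invalidation_size + fetches * data_size


def push_update_cost(writes, data_size):
    """Bytes sent when every write transfers the modified data."""
    return writes * data_size


data_size = 1000

# Low read/write ratio: most pushed updates would never be read.
print(invalidation_cost(reads=1, writes=100, data_size=data_size))   # 1100
print(push_update_cost(writes=100, data_size=data_size))             # 100000

# High read/write ratio: pushing the data costs no more, and reads stay local.
print(invalidation_cost(reads=100, writes=1, data_size=data_size))   # 1001
print(push_update_cost(writes=1, data_size=data_size))               # 1000
```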

30
Pull versus Push Protocols
  • Comparison between push-based and pull-based
    protocols in the case of multiple-client,
    single-server systems.

31
Remote-Write Protocols (I)
  • Primary-based remote-write protocol with a fixed
    server to which all read and write operations are
    forwarded.

32
Remote-Write Protocols (II)
  • The principle of primary-backup protocol.
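
The principle can be sketched as follows, assuming the blocking variant: the primary applies a write, forwards it to every backup, and reports completion only after all of them acknowledge. Reads are served from any local copy. Class and method names are hypothetical.

```python
class Backup:
    def __init__(self):
        self.store = {}

    def apply(self, key, value):
        """Apply an update forwarded by the primary and acknowledge it."""
        self.store[key] = value
        return True

    def read(self, key):
        """Reads are served locally."""
        return self.store.get(key)


class Primary:
    def __init__(self, backups):
        self.store = {}
        self.backups = backups

    def write(self, key, value):
        """Update the primary, then every backup; done once all have acknowledged."""
        self.store[key] = value
        acks = [backup.apply(key, value) for backup in self.backups]
        return all(acks)


backups = [Backup(), Backup()]
primary = Primary(backups)
print(primary.write("x", 1))              # True: all backups acknowledged
print([b.read("x") for b in backups])     # [1, 1]
```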

33
Local-Write Protocols (I)
Keeping track of each data item's current
location?
  • Primary-based local-write protocol in which a
    single copy is migrated between processes.

34
Local-Write Protocols (II)
Suitable for disconnected operation
  • Primary-backup protocol in which the primary
    migrates to the process wanting to perform an
    update.

35
Active Replication (I)
  • The problem of replicated invocations.

36
Active Replication (II)
(a) Forwarding an invocation request from a
replicated object. (b) Returning a reply to a
replicated object.
37
Quorum-Based Protocols
  • Three examples of the voting algorithm
  • (a) A correct choice of read and write set
  • (b) A choice that may lead to write-write conflicts
  • (c) A correct choice, known as ROWA (read one, write
    all)
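
The three cases can be checked against the two standard quorum constraints, NR + NW > N and NW > N/2. The concrete numbers below (N = 12 with quorums 3/10, 7/6 and 1/12) are assumed from the textbook figure this slide appears to reproduce; the slide text does not state them.

```python
def check_quorum(n, read_quorum, write_quorum):
    """Evaluate the two constraints of quorum-based voting for n replicas."""
    return {
        # Every read quorum overlaps every write quorum.
        "prevents_read_write_conflicts": read_quorum + write_quorum > n,
        # Any two write quorums overlap.
        "prevents_write_write_conflicts": write_quorum > n / 2,
    }


n = 12
print(check_quorum(n, read_quorum=3, write_quorum=10))   # (a) both constraints hold
print(check_quorum(n, read_quorum=7, write_quorum=6))    # (b) write-write conflicts possible
print(check_quorum(n, read_quorum=1, write_quorum=12))   # (c) ROWA: both constraints hold
```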

38
References
  • Distributed Systems: Principles and Paradigms -
    Andrew S. Tanenbaum, Maarten van Steen
  • Data Consistency in Intermittently Connected
    Distributed Systems - Evaggelia Pitoura, Bharat
    Bhargava, Ouri Wolfson
  • Maintaining Consistency of Data in Mobile
    Distributed Environments - Evaggelia Pitoura,
    Bharat Bhargava