Title: Consistency And Replication
1 Chapter 10
- Consistency And Replication
2 Topics
- Motivation
- Data-centric consistency models
- Client-centric consistency models
- Distribution protocols
- Consistency protocols
3 Motivation
- Make copies of services on multiple sites to improve
- Reliability (by redundancy)
- If the primary FS crashes, the standby FS still works
- Performance
- Increase processing power
- Reduce communication delays
- Scalability
- Prevent overloading a single server (size scalability)
- Avoid communication latencies (geographic scalability)
- However, updates are more complex
- When, who, where and how to propagate the updates?
4 Concurrency Control on Remote Objects
- A remote object capable of handling concurrent invocations on its own.
- A remote object for which an object adapter is required to handle concurrent invocations.
5 Object Replication
- A distributed system for replication-aware distributed objects.
- A distributed system responsible for replica management.
6 Distributed Data Store
The clients' point of view: a data store capable of storing their data.
7 Distributed Data Store
The data store's point of view: general organization of a logical data store, physically distributed and replicated across multiple processes.
8 Operations on a Data Store
- Read r_i(x)b: client/process P_i performs a read on data item x and gets value b back
- Write w_i(x)a: client/process P_i performs a write on data item x, setting it to the new value a
- Operations are not instantaneous
- Time of issue (when the request is sent by the client)
- Time of execution (when the request is executed at a replica)
- Time of completion (when the reply is received by the client)
(A small data-structure sketch follows below.)
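As a small illustration (not from the slides), this notation and the three timestamps can be modelled by one record per operation; all struct and field names below are invented for this sketch.

#include <stdio.h>

/* Sketch only: models r_i(x)b / w_i(x)a plus the issue, execution and
   completion times of a single operation. */
enum op_kind { READ, WRITE };

struct operation {
    enum op_kind kind;   /* READ or WRITE                              */
    int process;         /* the i in r_i / w_i                         */
    char item;           /* the data item, e.g. 'x'                    */
    int value;           /* b (value returned) or a (value written)    */
    double t_issue;      /* when the request is sent by the client     */
    double t_execute;    /* when the request is executed at a replica  */
    double t_complete;   /* when the reply is received by the client   */
};

int main(void) {
    /* w_1(x)5: issued at t=1.0, executed at t=1.2, completed at t=1.5 */
    struct operation w = { WRITE, 1, 'x', 5, 1.0, 1.2, 1.5 };
    printf("w_%d(%c)%d issue=%.1f execute=%.1f complete=%.1f\n",
           w.process, w.item, w.value, w.t_issue, w.t_execute, w.t_complete);
    return 0;
}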
9 Example
10 Consistency Models
- Defines which interleavings of operations are valid (admissible)
- Different levels of consistency
- strong (strict, tight)
- weak (loose)
- Consistency model
- Concerned with the consistency of a data store
- Specifies the characteristics of valid orderings of operations
- A data store that implements a particular consistency model will provide a total ordering of operations that is valid according to this model
11 Consistency Models
- Data-centric models
- Describe the consistency experienced by all clients
- Clients P1, P2, P3 see the same kind of ordering
- Client-centric models
- Describe the consistency seen only by clients who request it
- Clients P1, P2, P3 may see different kinds of orderings
12 Data-Centric Consistency Models
- Strong ordering
- Strict consistency
- Linear consistency
- Sequential consistency
- Causal consistency
- FIFO consistency
- Weak ordering
- Weak consistency
- Release consistency
- Entry consistency
13 Strict Consistency
- Definition: A DDS (distributed data store) is strictly consistent if any read on a data item x of the DDS returns the value corresponding to the result of the most recent write on x, regardless of the location of the processes doing the read or write.
- Analysis
- 1. In a single-processor system strict consistency comes for free; it is exactly the behavior of local shared memory with atomic reads/writes.
- 2. However, it is hard to establish a global time to determine which write is the most recent one.
- 3. Due to message transfer delays this model is not achievable in a distributed system.
14 Example
- Behavior of two processes operating on the same data item.
- (a) A strictly consistent store.
- (b) A store that is not strictly consistent.
15 Strict Consistency Problems
Assumption: y = 0 is stored on node 2; P1 and P2 are processes on node 1 and node 2. Due to message delays, r(y) at t = t2 may return 0 or 1, and at t = t4 it may return 0, 1 or 2. Furthermore, if y migrates to node 1 between t2 and t3, then r(y) issued at time t2 may even get value 2 (i.e. "back to the future").
16 Sequential Consistency (1)
- Definition: A DDS offers sequential consistency if all processes see the same order of accesses to the DDS, whereby the reads/writes of each individual process occur in program order, and the reads/writes of different processes are performed in some sequential order.
- Analysis
- 1. Sequential consistency is weaker than strict consistency.
- 2. Each valid permutation of accesses is allowed iff all tasks see the same permutation ⇒ two runs of a distributed application may have different results.
- 3. No global time ordering is required.
17 Example
Each task sees all writes in the same order, even though the store is not strictly consistent.
18 Non-Sequential Consistency
19 Linear Consistency
- Definition: A DDS is said to be linearly consistent (linearizable) when each operation is time-stamped and the following holds: the result of each execution is the same as if the (read and write) operations by all processes on the DDS were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program. In addition, if TS(OP1(x)) < TS(OP2(y)), then operation OP1(x) should precede OP2(y) in this sequence.
20 Assumption
- Each operation is assumed to receive a timestamp from a globally available clock with only finite precision, e.g. a set of loosely synchronized local clocks.
- Linear consistency is stricter than sequential consistency, i.e. a linearly consistent DDS is also sequentially consistent.
- With linear consistency not every valid interleaving of reads and writes is allowed any more; the ordering also has to obey the order implied by the timestamps of these operations.
21 Causal Consistency (1)
- Definition: A DDS is assumed to provide causal consistency if the following condition holds: writes that are potentially causally related must be seen by all tasks in the same order. Concurrent writes may be seen in a different order on different machines.
- If event B is caused or influenced by an earlier event A, causality requires that everyone else also sees first A and then B.
22 Causal Consistency (2)
- Definition: write2 is potentially dependent on write1 when there is a read between these two writes which may have influenced write2.
- Corollary: if write2 is potentially dependent on write1, then the only correct sequence is write1 → write2.
23 Causal Consistency Example
- This sequence is allowed with a causally-consistent store, but not with a sequentially or strictly consistent store.
24 Causal Consistency Example
- A violation of a causally-consistent store.
- A correct sequence of events in a causally-consistent store.
25 Implementation
- Implementing causal consistency requires keeping track of which processes have seen which writes.
- Construction and maintenance of a dependency graph, expressing which operations are causally related (using vector timestamps; a small sketch follows below).
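A minimal sketch of such a vector-timestamp comparison; the number of processes and all names are chosen purely for illustration, a real store would attach such a vector to every write.

#include <stdio.h>

#define NPROC 3   /* number of processes, arbitrary for this sketch */

enum order { BEFORE, AFTER, CONCURRENT };

/* A write tagged with vector a causally precedes one tagged with b
   iff a <= b component-wise and a != b. */
enum order compare(const int a[NPROC], const int b[NPROC]) {
    int a_le_b = 1, b_le_a = 1;
    for (int i = 0; i < NPROC; i++) {
        if (a[i] > b[i]) a_le_b = 0;
        if (b[i] > a[i]) b_le_a = 0;
    }
    if (a_le_b && !b_le_a) return BEFORE;   /* a happened-before b */
    if (b_le_a && !a_le_b) return AFTER;    /* b happened-before a */
    return CONCURRENT;                      /* no causal relation  */
}

int main(void) {
    int w1[NPROC] = {1, 0, 0};   /* write by P1                   */
    int w2[NPROC] = {1, 1, 0};   /* write by P2 after it read w1  */
    int w3[NPROC] = {0, 0, 1};   /* independent write by P3       */
    printf("w1 vs w2: %s\n", compare(w1, w2) == BEFORE ? "causally ordered" : "concurrent");
    printf("w1 vs w3: %s\n", compare(w1, w3) == CONCURRENT ? "concurrent" : "causally ordered");
    return 0;
}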
26 FIFO or PRAM Consistency
- Definition: A DDS implements FIFO consistency when all writes of one process are seen in the same order by all other processes, i.e. they are received by all other processes in the order they were issued. However, writes from different processes may be seen in a different order by different processes.
- Corollary: writes on different processors are concurrent.
- Implementation: tag each write operation of every process with (PID, sequence number); a sketch of this tagging follows below.
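A minimal sketch of the (PID, sequence number) tagging, assuming a replica simply defers a write until it is the next one expected from its sender; all names are illustrative.

#include <stdio.h>

#define NPROC 4

/* A write tagged by its sender, as suggested above. */
struct write_op {
    int pid;     /* issuing process                            */
    int seq;     /* per-process sequence number, starting at 0 */
    int value;
};

/* Next sequence number this replica expects from each process. */
static int expected[NPROC];

/* Apply the write only if it is the next one from its sender;
   otherwise the caller would buffer it and retry later. */
int try_deliver(const struct write_op *w) {
    if (w->seq != expected[w->pid])
        return 0;                 /* out of order: keep it buffered */
    expected[w->pid]++;           /* in order: accept the write     */
    printf("applied write %d from P%d (value %d)\n", w->seq, w->pid, w->value);
    return 1;
}

int main(void) {
    struct write_op a = {1, 0, 10}, b = {1, 1, 20}, c = {2, 0, 30};
    try_deliver(&b);   /* rejected: write 0 of P1 not yet seen          */
    try_deliver(&a);   /* accepted                                      */
    try_deliver(&b);   /* now accepted, preserving P1's issue order     */
    try_deliver(&c);   /* P2's writes are ordered independently of P1's */
    return 0;
}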
27 Example
Both writes are seen by processes P3 and P4 in a different order; they still obey FIFO consistency, but not causal consistency, because write2 is dependent on write1.
28 Example (2)
Two concurrent processes with shared variables x = y = 0:

  Process P1:                 Process P2:
    x = 1;                      y = 1;
    if (y == 0) print("A");     if (x == 0) print("B");

- Possible results
- A
- B
- Nil
- AB?
29 Synchronization Variable
- Background: it is not necessary to propagate intermediate writes.
- Synchronization variable
- Associated with one operation, synchronize(S).
- Synchronizes all local copies of the data store.
30Compilation Optimization
int a, b, c, d, e, x, y / variables /int
p, q / pointers /int f( int p, int
q) / function prototype / a x
x / a stored in register /b y
y / b as well /c aaa bb a
b / used later /d a a c / used
later /p a / p gets address of a /q
b / q gets address of b /e f(p,
q) / function call /
- A program fragment in which some variables may be
kept in registers.
31 Weak Consistency
- Definition: A DDS implements weak consistency if the following hold:
- Accesses to synchronization variables obey sequential consistency
- No access to a synchronization variable is allowed to be performed until all previous writes have completed everywhere
- No data access (read or write) is allowed to be performed until all previous accesses to synchronization variables have been performed
32 Interpretation
- A synchronization variable S has just one operation, synchronize(S), responsible for all local replicas of the data store.
- Whenever a process calls synchronize(S), its local updates are propagated to all replicas of the DDS, and the updates of the other processes are propagated to its local replica of the DDS.
- All tasks see all accesses to synchronization variables in the same order.
33 Interpretation (2)
- No data access is allowed until all previous accesses to synchronization variables have been done.
- By doing a synchronize before reading shared data, a task can be sure of getting the up-to-date value.
- Unlike the previous consistency models, weak consistency forces the programmer to collect critical operations together (a usage sketch follows below).
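A hedged usage sketch, assuming a single synchronize() primitive in the spirit of the slide's synchronize(S); the function is only a stub here so that the fragment compiles, it is not part of any real DSM API.

#include <stdio.h>

int a, b, c;   /* shared data items in the DDS */

/* Hypothetical stand-in for synchronize(S): in a weakly consistent store
   this would push all local writes to the other replicas and pull in
   their writes. Stubbed so the sketch compiles. */
void synchronize(void) {
    printf("synchronize: local writes propagated, remote writes fetched\n");
}

void writer(void) {
    a = 1; b = 2; c = a + b;   /* intermediate writes need not be propagated  */
    synchronize();             /* only now must the results be visible to all */
}

void reader(void) {
    synchronize();             /* without this, a, b and c may be stale */
    printf("a=%d b=%d c=%d\n", a, b, c);
}

int main(void) {
    writer();
    reader();
    return 0;
}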
34 Example
Via synchronization you can enforce that you will get up-to-date values. Each process must synchronize if its writes are to be seen by others. A process issuing a read without any synchronization measures may get out-of-date values.
35 Non-weak Consistency
36 Release Consistency
- Problem with weak consistency: when a synchronization variable is accessed, the DDS does not know whether this is done because a process has finished writing the shared variables or because it is about to read them.
- It must therefore take the actions required in both cases, namely making sure that all locally initiated writes have been completed (i.e. propagated to all other machines), as well as gathering in all writes from other machines.
- Solution: provide two operations, acquire and release.
37 Details
- Idea
- Distinguish between memory accesses in front of a critical section (acquire) and those behind a critical section (release).
- Implementation
- When a release is done, all the protected data that have been updated within the critical section are propagated to all replicas.
38 Definition
- Definition: A DDS offers release consistency if the following three conditions hold (a usage sketch follows below):
- 1. Before a read or write operation on shared data is performed, all previous acquires done by the process must have completed successfully.
- 2. Before a release is allowed to be performed, all previous reads and writes by the process must have been completed.
- 3. Accesses to synchronization variables are FIFO consistent.
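A hedged sketch of the acquire/release usage pattern that follows from these rules; acquire() and release() are hypothetical primitives, stubbed out so that the fragment compiles.

#include <stdio.h>

int shared_x;   /* shared data item protected by the critical section */

/* Hypothetical acquire: rule 1 -- the protected reads/writes may only run
   after the acquire has completed, i.e. the local replica is up to date. */
void acquire(void) { printf("acquire: local copy of protected data refreshed\n"); }

/* Hypothetical release: rule 2 -- all reads/writes of this process inside
   the critical section are finished before the release propagates them. */
void release(void) { printf("release: updates pushed to the other replicas\n"); }

void update(void) {
    acquire();                 /* enter the critical section        */
    shared_x = shared_x + 1;   /* sees the latest value of shared_x */
    release();                 /* now the new value becomes visible */
}

int main(void) {
    update();
    printf("shared_x=%d\n", shared_x);
    return 0;
}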
39 Example
A valid event sequence for release consistency, even though P3 failed to use acquire and release. Remark: acquire is more than a lock or enter_critical_section; it waits until all updates on protected data from other nodes have been propagated to its local replica before it enters the critical section.
40 Lazy Release Consistency
- Problem with eager release consistency: when a release is done, the process doing the release pushes out all the modified data to all processes that already have a copy and thus might potentially read them in the future.
- There is no way to tell whether all the target machines will ever use any of these updated values in the future ⇒ the above solution is somewhat inefficient, with too much overhead.
41 Details
- With lazy release consistency nothing is done at a release.
- However, at the next acquire the processor determines whether it already has all the data it needs. Only when it needs updated data does it send messages to those places where the data have been changed in the past.
- Timestamps help to decide whether a data item is outdated.
42 Entry Consistency
- Unlike release consistency, entry consistency requires each ordinary shared variable to be protected by a synchronization variable.
- When an acquire is done on a synchronization variable, only those ordinary shared variables guarded by that synchronization variable are made consistent.
- A list of shared variables may be assigned to a synchronization variable (to reduce overhead).
43 How to Synchronize?
- Every synchronization variable has a current owner.
- An owner may enter and leave critical sections protected by this synchronization variable as often as needed without sending any coordination message to the others.
- A process wanting to get a synchronization variable has to send a message to the current owner.
- The current owner hands over the synchronization variable together with all updated values of its previous writes.
- Multiple simultaneous readers are possible in non-exclusive (read) mode.
- A sketch of this ownership hand-over is given below.
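A minimal single-process sketch of this ownership hand-over, assuming two replicas and one guarded variable; all names are invented for the illustration and messages are replaced by plain memory copies.

#include <stdio.h>

#define NREPLICA 2

/* Local copy of one guarded shared variable, one per replica. */
static int guarded_x[NREPLICA];

/* A synchronization variable with its current owner. */
struct sync_var {
    int owner;   /* replica currently owning the variable */
};

/* Acquiring the sync variable hands over ownership together with the
   current values of the variables it guards. */
void acquire_sync(struct sync_var *s, int requester) {
    if (s->owner != requester) {
        guarded_x[requester] = guarded_x[s->owner];  /* copy the latest value */
        s->owner = requester;                        /* transfer ownership    */
    }
}

int main(void) {
    struct sync_var s = { .owner = 0 };

    acquire_sync(&s, 0);
    guarded_x[0] = 42;     /* the owner updates the guarded variable locally */

    acquire_sync(&s, 1);   /* replica 1 pulls the value from the old owner   */
    printf("replica 1 sees x = %d\n", guarded_x[1]);   /* prints 42 */
    return 0;
}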
44 Example
A valid event sequence for entry consistency
45 Summary of Consistency Models

(a) Consistency models not using synchronization operations:

Consistency     | Description
Strict          | Absolute time ordering of all shared accesses matters.
Linearizability | All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (non-unique) global timestamp.
Sequential      | All processes see all shared accesses in the same order. Accesses are not ordered in time.
Causal          | All processes see causally-related shared accesses in the same order.
FIFO            | All processes see writes from each other in the order they were issued. Writes from different processes may not always be seen in that order.

(b) Models with synchronization operations:

Consistency     | Description
Weak            | Shared data can be counted on to be consistent only after a synchronization is done.
Release         | Shared data are made consistent when a critical region is exited.
Entry           | Shared data pertaining to a critical region are made consistent when a critical region is entered.
46 Up to Now
- System-wide consistent view of the DDS
- Independent of the number of involved processes
- Mutually exclusive atomic operations on the DDS
- Processes access only local copies
- Propagation of updates has to be done whenever it is necessary to fulfill the requirements of the consistency model
- Are there still weaker consistency models?
47 Client-Centric Consistency
- Provides guarantees about the ordering of operations only for a single client, i.e.
- The effects of an operation depend on the client performing it
- The effects also depend on the history of the client's operations
- Applied only when requested by the client
- No guarantees concerning concurrent accesses by different clients
- Assumption
- Clients can access different replicas, e.g. mobile users
48 Mobile Users
- The principle of a mobile user accessing different replicas of a distributed database.
49 Eventual Consistency
- If updates do not occur for a long period of time, all replicas will gradually become consistent
- Requirements
- Few read/write conflicts
- No write/write conflicts
- Clients can accept temporary inconsistency
- Examples
- DNS
- No write/write conflicts
- Updates propagate slowly (1-2 days) to all caches
- WWW
- Few write/write conflicts
- Mirrors eventually updated
- Cached copies (browser or proxy) eventually replaced
50 Client-Centric Consistency Models
- Monotonic Reads
- Monotonic Writes
- Read Your Writes
- Writes Follow Reads
51 Monotonic Reading
- Definition: A DDS provides monotonic-read consistency if the following holds:
- If a process P reads the value of data item x, any successive read operation on x by that process will always return the same value or a more recent one (independently of the replica at location L where this new read is done).
52 Example Systems
- A distributed e-mail database with distributed and replicated user mailboxes.
- E-mails can be inserted at any location.
- However, updates are propagated in a lazy (i.e. on-demand) fashion.
53 Example
- The read operations performed by a single process P at two different local copies of the same data store.
- A monotonic-read consistent data store.
- A data store that does not provide monotonic reads.
54 Monotonic Writing
- Definition: A DDS provides monotonic-write consistency if the following holds:
- A write operation by process P on data item x is completed before any successive write operation on x by the same process P can take place.
- Remark: monotonic writing resembles FIFO consistency
- It only applies to writes from one client process P
- Different clients not requiring monotonic writing may see the writes of process P in any order
55 Example
- The write operations performed by a single process P at two different local copies of the same data store.
- A monotonic-write consistent data store.
- A data store that does not provide monotonic-write consistency.
56 Reading Your Writes
- Definition: A DDS provides read-your-writes consistency if the following holds:
- The effect of a write operation by a process P on a data item x at a location L will always be seen by a successive read operation by the same process.
- Examples of missing read-your-writes consistency:
- Updating a website with an editor: if you want to view your updated website, you have to refresh it, otherwise the browser uses the old cached website content.
- Updating passwords
57 Example
- A data store that provides read-your-writes consistency.
- A data store that does not.
58 Writes Following Reads
- Definition: A DDS provides writes-follow-reads consistency if the following holds:
- A write operation by a process P on a data item x, following a previous read by the same process, is guaranteed to take place on the same or an even more recent value of x than the one having been read before.
59 Example
- A writes-follow-reads consistent data store.
- A data store that does not provide writes-follow-reads consistency.
60 Implementing Client-Centric Consistency
- Naive implementation (ignoring performance)
- Each write gets a globally unique identifier
- The identifier is assigned by the server that accepts this write operation for the first time
- For each client, two sets of write identifiers are maintained
- Read set of client C: RS(C)
- The write IDs relevant for the reads of this client C
- Write set of client C: WS(C)
- The write IDs of the writes having been performed by client C
61 Implementing Monotonic Reads
When a client C performs a read at server S, that server is handed the client's read set RS(C) to check whether all writes identified in it have taken place locally at server S. If not, the server has to be brought up to date before reading! A sketch of this check is given below.
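A minimal sketch of this check, representing the write-ID sets as small fixed-size arrays; all names and sizes are chosen purely for illustration.

#include <stdio.h>

#define MAX_IDS 16

/* A set of globally unique write identifiers, as in the naive scheme. */
struct id_set {
    int ids[MAX_IDS];
    int count;
};

static int contains(const struct id_set *s, int id) {
    for (int i = 0; i < s->count; i++)
        if (s->ids[i] == id) return 1;
    return 0;
}

/* Monotonic-read check: server S may answer the read only if every write
   recorded in the client's read set RS(C) has already been applied at S;
   otherwise S first has to fetch the missing writes. */
int may_serve_read(const struct id_set *applied_at_S, const struct id_set *rs_client) {
    for (int i = 0; i < rs_client->count; i++)
        if (!contains(applied_at_S, rs_client->ids[i]))
            return 0;   /* S is missing a write the client has already seen */
    return 1;
}

int main(void) {
    struct id_set server = { {101, 102}, 2 };   /* writes applied at S     */
    struct id_set rs     = { {101, 103}, 2 };   /* writes seen by client C */
    printf("serve read directly? %s\n",
           may_serve_read(&server, &rs) ? "yes" : "no, update S first");
    return 0;
}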
62 Implementing Monotonic Writes
- If a client initiates a write on a server S, this server S gets the client's write set in order to bring S up to date first. The writes are applied according to their timestamped write IDs.
- Having done the new write, the client's write set is updated with this new write. The response time of a client might thus increase with an ever increasing write set.
- However, what to do if the read and write sets of a client get larger and larger?
63 Improving Efficiency with RS and WS
- Major drawback: the potential sizes of the read and write sets
- Group all write and read operations of a client into a so-called session (mostly associated with an application)
- Every time a client closes its current session, all updates are propagated and these sets are deleted afterwards
64 Summary on Consistency Models
- Choosing the right consistency model requires an analysis of the following trade-offs:
- Consistency and redundancy
- All replicas must be consistent
- All replicas must contain the full state
- Reduced consistency ⇒ reduced reliability
- Consistency and performance
- Consistency requires extra work
- Consistency requires extra communication
- May result in a loss of overall performance
65 Distribution Protocols
- Replica Placement
- Permanent Replicas
- Server-Initiated Replicas
- Client-Initiated Replicas
- Update Propagation
- State versus Operations
- Pull versus Push Protocols
- Unicasting versus Multicasting
- Epidemic Protocols
- Update Propagation Models
- Removing data
66 Replica Placement
- The logical organization of different kinds of copies of a data store into three concentric rings.
67 Replica Placement
- Permanent replicas
- The initial set of replicas, created and maintained by the DDS owner(s)
- Writes are allowed
- E.g., web mirrors
- Server-initiated replicas
- Enhance performance
- Not maintained by the owner of the DDS
- Placed close to groups of clients, either manually or dynamically
- Client-initiated replicas
- Client caches
- Temporary
- Owner not aware of the replica
- Placed closest to a client
- Maintained by the host (often the client)
68 Update Propagation
69 What to Be Propagated?
- Propagate only a notification of an update (invalidation)
- Typical for invalidation protocols
- May include information about which part of the DDS has been updated
- Works best when the read-to-write ratio is low
- Propagate the updated data from one replica to another
- Works best when the read-to-write ratio is high
- Several updates may be aggregated before sending them across the network
- Propagate the update operation to the other replicas (active replication)
- This approach, called active replication, works well if the size of the parameters associated with each operation is small compared to the updated data
70 Pull versus Push Protocols
- Push protocols, i.e. updates are propagated to other replicas without those replicas having asked for them
- Used between permanent and server-initiated replicas, i.e. to achieve a relatively high degree of consistency
- Pull protocols, i.e. a server (or a client) asks another server to provide the updates
- Used by client caches; e.g. when a client requests a web page that has not been updated for a longer period of time, it may check the original web site to see whether updates have been made
- Efficient when the read-to-write ratio is relatively low
71Pull versus Push Protocols
Issue Push-based Pull-based
State of server List of client replicas and caches None
Messages sent Update (and possibly fetch update later) Poll and update
Response time at client Immediate (or fetch-update time) Fetch-update time
- A comparison between push-based and pull-based
protocols in the case of multiple client, single
server systems.
72 Unicasting
Potential overhead with unicasting in a LAN. Good for the pull-based approach.
73 Multicasting
With multicasting an update message can be propagated more efficiently across a LAN. Good for the push-based approach.
74 Epidemic Protocols
- To implement eventual consistency you may rely on epidemic protocols.
- No guarantees of absolute consistency are given, but after some time epidemic protocols will have spread the updates to all replicas.
- Notions
- An infective server is a server with a replica that it is willing to spread to other servers, too
- A susceptible server is a server that has not yet been infected, i.e. updated
- A removed server is a server that does not want to propagate any information
75 Anti-Entropy Protocol
- Server P picks another server Q at random and subsequently exchanges updates with Q; there are three approaches to exchanging updates (a sketch follows below):
- P only pushes its own updates to Q
- P only pulls in new updates from Q
- P and Q exchange their updates with each other, i.e. a push-pull approach
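A minimal sketch of these three exchange styles, assuming each server just keeps a version number per data item (a higher number means a newer update); all names and values are illustrative.

#include <stdio.h>

#define NITEMS 3

struct server {
    const char *name;
    int version[NITEMS];   /* per-item version numbers held by this server */
};

/* Push: P overwrites Q's items wherever P holds a newer version. */
void push(const struct server *p, struct server *q) {
    for (int i = 0; i < NITEMS; i++)
        if (p->version[i] > q->version[i])
            q->version[i] = p->version[i];
}

/* Push-pull: both directions, so P and Q end up identical. */
void push_pull(struct server *p, struct server *q) {
    push(p, q);   /* P's newer updates flow to Q */
    push(q, p);   /* Q's newer updates flow to P */
}

static void show(const struct server *s) {
    printf("%s:", s->name);
    for (int i = 0; i < NITEMS; i++) printf(" %d", s->version[i]);
    printf("\n");
}

int main(void) {
    struct server p = { "P", {2, 0, 1} };
    struct server q = { "Q", {1, 3, 1} };
    push_pull(&p, &q);   /* a pure pull by P would just be push(&q, &p) */
    show(&p);            /* both now hold versions 2 3 1                */
    show(&q);
    return 0;
}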
76 Gossip Protocols
- Rumor spreading or gossiping works as follows:
- If server P has been updated for data item x, it contacts another arbitrary server Q and tries to push its new update of x to Q.
- However, if Q has already received this update from some other server, P is so disappointed that it stops gossiping with probability 1/k.
77 Gossip Protocols (2)
- Although gossiping works quite well on average, you cannot guarantee that every server will be updated.
- In a DDS with a large number of replicas, the fraction s of servers remaining ignorant of an update, i.e. that are still susceptible, satisfies
  s = e^(-(k+1)(1-s))
78 Analysis of Epidemic Protocols
- Advantages
- Scalability, due to the limited number of update messages
- Disadvantage
- Spreading the deletion of a data item is quite cumbersome, due to an unwanted side effect:
- Suppose you have deleted data item x on server S; you may then receive an old copy of data item x again from some other server due to still ongoing gossiping.
79 Consistency Protocols
- Primary-Based Protocols
- Remote-Write Protocols
- Local-Write Protocols
- Replicated-Write Protocols
- Active Replication
- Quorum-Based Protocols
80 Primary-Based Protocols
- Each data item x of a DDS has an associated primary, responsible for coordinating write operations on x
- Primary server
- Fixed, i.e. a specific remote server (remote-write protocols)
- Dynamic, i.e. the primary is migrated to the place of the next write (local-write protocols)
81 Remote-Write Protocols (1)
- Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.
82 Remote-Write Protocols (2)
- The principle of the primary-backup protocol.
83 Local-Write Protocols (1)
- Primary-based local-write protocol in which a single copy is migrated between processes.
84 Local-Write Protocols (2)
- Primary-backup protocol in which the primary migrates to the process wanting to perform an update.
85 Replicated-Write Protocols
- Writes can take place at multiple replicas, instead of only at a specific primary server.
- Active replication
- The operation is forwarded to all replicas
- Problems
- Making sure all operations are carried out in the same order everywhere
- Scalability
- Replicated invocation
- Majority voting
- Before reading or writing, ask a subset of all replicas for permission
86 Replicated Invocation for Active Replication
87 Solutions
- Forwarding an invocation request from a replicated object.
- Returning a reply to a replicated object.
88 Quorum-Based Protocols
- Preliminaries
- If a client wants to read or write, it first must request and acquire the permission of multiple servers.
- Example
- A DFS with file F replicated on N servers. If an update is to be made, demand that the client first contacts at least half of the servers plus one and gets them to agree to its update. Once they have agreed, file F gets a new version number.
- To read file F, a client must also contact at least half of the servers and ask them to hand out the current version number of F.
89 Gifford's Quorum-Based Protocol
- To read a file F, a client must assemble a read quorum, an arbitrary collection of NR servers.
- To write a file F, a write quorum of at least NW servers is required. The following must hold:
- A) NR + NW > N
- B) NW > N/2
- Rule A is used to prevent read-write conflicts
- Rule B is used to prevent write-write conflicts
A small check of these two rules is sketched below.
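A tiny check of the two rules in code; N = 12 and the three (NR, NW) pairs are chosen only to mirror the kinds of cases shown on the next slide.

#include <stdio.h>

/* Gifford's constraints: NR + NW > N prevents read-write conflicts,
   NW > N/2 prevents write-write conflicts (written as 2*NW > N to avoid
   integer-division surprises). */
int valid_quorum(int n, int nr, int nw) {
    return (nr + nw > n) && (2 * nw > n);
}

int main(void) {
    int n = 12;   /* number of replicas, chosen for illustration */
    printf("NR=3,  NW=10: %s\n", valid_quorum(n, 3, 10) ? "valid" : "invalid");
    printf("NR=7,  NW=6 : %s\n", valid_quorum(n, 7, 6)  ? "valid" : "invalid (write-write conflicts possible)");
    printf("NR=1,  NW=12: %s (ROWA: read one, write all)\n", valid_quorum(n, 1, 12) ? "valid" : "invalid");
    return 0;
}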
90 Examples
- Three examples of the voting algorithm:
- A correct choice of read and write sets
- A choice that may lead to write-write conflicts
- A correct choice, known as ROWA (read one, write all)