Consistency and Replication - PowerPoint PPT Presentation

Provided by: steve1836

Learn more at: http://www.csce.uark.edu

Transcript and Presenter's Notes

1
Consistency and Replication

Chapter 6

2
Update Propagation

Three possibilities
Propagate only a notification of an update
Transfer data from one copy to another
Propagate the update operation to other copies

3
Pull versus Push Protocols

Push-based approach
Also called a server-based protocol
Server initiates the transfer without the client asking for it
Pull-based approach
Transfer happens when the client asks for it
Advantages depend on the type of workload: amount of data, frequency of updates, frequency of read-only operations

4
Pull versus Push Protocols
Issue                   | Push-based                               | Pull-based
State of server         | List of client replicas and caches       | None
Messages sent           | Update (and possibly fetch update later) | Poll and update
Response time at client | Immediate (or fetch-update time)         | Fetch-update time

A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.
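
A rough way to see why the better choice depends on the workload is to count messages. The sketch below is a toy model, not a measurement. The function names are made up for illustration, and it assumes that a push sends one message per client per update and that every pull-based read costs one poll plus one reply.

```python
def push_messages(num_updates: int, num_clients: int) -> int:
    """Assumed cost model: the server pushes every update to every client."""
    return num_updates * num_clients


def pull_messages(num_reads: int) -> int:
    """Assumed cost model: each read costs one poll and one reply."""
    return 2 * num_reads


def cheaper(num_updates: int, num_reads: int, num_clients: int) -> str:
    """Return which approach sends fewer messages under the toy model."""
    push = push_messages(num_updates, num_clients)
    pull = pull_messages(num_reads)
    if push == pull:
        return "tie"
    return "push" if push < pull else "pull"


# Read-heavy workload: few updates, many reads.
print(cheaper(num_updates=5, num_reads=1000, num_clients=10))   # push
# Update-heavy workload: many updates, few reads.
print(cheaper(num_updates=1000, num_reads=5, num_clients=10))   # pull
```

The model ignores message size, so it says nothing about the "amount of data" factor; it only illustrates the update-to-read ratio.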

5
Lease Protocols

Have the copy expire after a period of time
A client can ask to renew a lease
A short lease can be given to a client that uses the item only infrequently, so the server doesn't have to maintain state for as long
Have to worry about different clocks!
Fast client, slow server
Fast server, slow client

6
Lease Protocols

Three kinds of leases
Age-based: leases are given out on data items depending on the last time the item was modified. Data that has gone unmodified for a long time gets a long lease, which reduces the number of update messages
Renewal-frequency based: a client that often asks for its cached copy to be refreshed is given a longer lease
State-space overhead based: the server lowers the lease time as it becomes overloaded, thus reducing the amount of state information it has to maintain
In all of these cases, updates are pushed by the server as long as the lease has not expired.
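
The age-based and state-space policies can be sketched in a few lines. This is only an illustration under stated assumptions: the `Lease` class, the scaling `factor`, the `min_s`/`max_s` bounds, and the `capacity` threshold are all hypothetical and do not come from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    granted_at: float  # server clock, in seconds
    duration: float    # lease length, in seconds

    def is_valid(self, now: float) -> bool:
        """The server keeps pushing updates only while the lease is valid."""
        return now < self.granted_at + self.duration


def age_based_lease(now: float, last_modified: float, factor: float = 0.1,
                    min_s: float = 1.0, max_s: float = 3600.0) -> Lease:
    """Hypothetical age-based policy: older (stable) items get longer leases."""
    age = now - last_modified
    duration = min(max(age * factor, min_s), max_s)
    return Lease(granted_at=now, duration=duration)


def shrink_for_load(lease: Lease, tracked_clients: int, capacity: int) -> Lease:
    """Hypothetical state-space policy: shorten leases when overloaded."""
    if tracked_clients <= capacity:
        return lease
    scale = capacity / tracked_clients
    return Lease(lease.granted_at, lease.duration * scale)


hot = age_based_lease(now=1000.0, last_modified=990.0)   # recently modified
cold = age_based_lease(now=1000.0, last_modified=0.0)    # long unmodified
print(hot.duration < cold.duration)                      # True
print(shrink_for_load(cold, tracked_clients=200, capacity=100).duration)
```

Note that `is_valid` uses a single clock. The fast-client/slow-server problem arises exactly because client and server each evaluate this test against their own clock.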

7
Epidemic protocols

Goal is to propagate updates to all replicas in as few messages as possible, based on the model of infectious diseases
A server is infective if it holds an update that
it is willing to spread to other servers
A server that has not been updated yet is
susceptible
An updated server not willing or able to spread
its update is said to be removed

8
Epidemic protocols

A popular propagation model is that of
anti-entropy
A server P picks another server Q at random and exchanges updates. Choices include:
P only pushes its own updates to Q
Not as rapid for spreading updates
P only pulls in new updates from Q
Works best when many servers are infective
P and Q send updates to each other
An initial push to several servers helps spread the update
A variant is gossiping: if a server tries to spread a rumor to a server that already knows it, it becomes removed with probability 1/k
See the papers on Spinglass for more information!
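
The three anti-entropy choices can be compared with a small simulation. This is a minimal sketch, assuming synchronous rounds in which every server picks one random partner and a single update starts at server 0; the function name and the round structure are illustrative assumptions. It models anti-entropy only, not the gossip variant with removal probability 1/k.

```python
import random


def anti_entropy_rounds(n: int, mode: str, seed: int = 0) -> int:
    """Count rounds until all n servers hold the update.

    mode is "push", "pull", or "push-pull". Server 0 starts infective.
    """
    if mode not in ("push", "pull", "push-pull"):
        raise ValueError("unknown mode: " + mode)
    rng = random.Random(seed)
    has_update = [False] * n
    has_update[0] = True
    rounds = 0
    while not all(has_update):
        rounds += 1
        snapshot = has_update[:]       # who was infective when the round began
        for p in range(n):
            q = rng.randrange(n - 1)
            if q >= p:
                q += 1                 # pick a partner Q other than P
            if mode in ("push", "push-pull") and snapshot[p]:
                has_update[q] = True   # P pushes its update to Q
            if mode in ("pull", "push-pull") and snapshot[q]:
                has_update[p] = True   # P pulls the update from Q
    return rounds


for mode in ("push", "pull", "push-pull"):
    print(mode, anti_entropy_rounds(n=1000, mode=mode, seed=42))
```

The round counts depend on the seed, so no particular output is claimed here. Running it with several seeds gives a feel for how the three variants behave early on (few infective servers) versus late (many infective servers).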

9
Remote-Write Protocols (1)

Primary-based remote-write protocol with a fixed
server to which all read and write operations are
forwarded.

10
Remote-Write Protocols (2)

The principle of primary-backup protocol.

11
Local-Write Protocols (1)

Primary-based local-write protocol in which a
single copy is migrated between processes.

12
Local-Write Protocols (2)

Primary-backup protocol in which the primary
migrates to the process wanting to perform an
update.

13
Quorum-Based Protocols

Three examples of the voting algorithm:
A correct choice of read and write set
A choice that may lead to write-write conflicts
A correct choice, known as ROWA (read one, write all)
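
The slide does not spell out the rules that make a choice "correct". The sketch below checks the usual constraints of quorum-based voting: the read quorum and write quorum must overlap (N_R + N_W > N), and any two write quorums must overlap (N_W > N/2). The function name and the concrete values for N, N_R, and N_W are assumptions chosen to match the three cases listed above, not numbers taken from the slide.

```python
from typing import List


def check_quorum(n: int, n_r: int, n_w: int) -> List[str]:
    """Return the conflicts that a (read quorum, write quorum) choice allows."""
    problems = []
    if n_r + n_w <= n:
        # A read set and a write set may be disjoint: stale reads are possible.
        problems.append("read-write conflict possible")
    if 2 * n_w <= n:
        # Two write sets may be disjoint: concurrent writes are possible.
        problems.append("write-write conflict possible")
    return problems


N = 12  # assumed number of replicas
print(check_quorum(N, n_r=3, n_w=10))  # []
print(check_quorum(N, n_r=7, n_w=6))   # ['write-write conflict possible']
print(check_quorum(N, n_r=1, n_w=12))  # [] (ROWA: read one, write all)
```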

14
TreadMarks

"TreadMarks: Shared Memory Computing on Networks of Workstations," by Cristiana Amza, Alan L. Cox, Sandhya Dwarkadas, Pete Keleher, Honghui Lu, Ramakrishnan Rajamony, Weimin Yu, and Willy Zwaenepoel, IEEE Computer, 29(2), 1996.
Developed at Rice University in the early 1990s
Network of workstations
Goal - distributed shared memory

15
TreadMarks

Runs at the user level in Unix
Uses many techniques to reduce the communication
overhead.
Lazy release consistency
A multiple writer protocol
An API allows programs to create shared variables
and to call synchronization primitives

16
Synchronization Primitives

Two kinds
Simple barrier: Tmk_barrier()
Mutex lock: Tmk_lock_acquire(), Tmk_lock_release()

17
Example program Jacobi decomposition

An application that solves partial differential equations.
y1 -x1 - x2 .... x3 ...
y2 x1 x2 .... x3 .....
Use a grid (matrix) to approximate the derivatives.
The initial configuration is Generation 1. You move to the next generation by modifying the first. Each position is modified according to its neighbors' values.
Look at the cells above, below, to the right, and to the left, and calculate their numeric average.
When new results have been computed, put them in the temporary scratch grid, then swap grids and keep going until you're "close enough".
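
The procedure just described can be sketched sequentially as follows. This is a minimal single-process illustration, not TreadMarks code. The function names, the tolerance `tol`, and the fixed-boundary example grid are assumptions made for the sketch.

```python
def jacobi_step(grid):
    """Return a new grid: each interior cell is the average of its 4 neighbors."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]     # boundary cells keep their values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (grid[i - 1][j] + grid[i + 1][j] +
                         grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return new


def jacobi(grid, tol=1e-6, max_iters=10_000):
    """Iterate until the largest per-cell change drops below tol."""
    for iteration in range(1, max_iters + 1):
        new = jacobi_step(grid)
        change = max(abs(new[i][j] - grid[i][j])
                     for i in range(len(grid)) for j in range(len(grid[0])))
        grid = new                     # swap: the scratch grid becomes current
        if change < tol:               # "close enough"
            return grid, iteration
    return grid, max_iters


# Example: a 5x5 plate whose top edge is held at 100 and other edges at 0.
start = [[100.0] * 5] + [[0.0] * 5 for _ in range(4)]
result, iterations = jacobi(start)
print(iterations, round(result[2][2], 3))
```

Writing into a separate scratch grid matters: updating cells in place would mix values from two generations.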

18
Example program Jacobi decomposition

If you have a really large grid, split the grid
in half, assign p0 to one half, p1 to the other.
Each computes their results, then they stop and
exchange rows. Then go again.
Each process:
do
    compute my portion
    Tmk_barrier()
    get new grid
    Tmk_barrier()  // make sure everyone has fully copied to their new grid
until done  // until you're satisfied with your results
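
The two-barrier loop can be tried out on a single machine with threads. In the sketch below, Python's `threading.Barrier` stands in for `Tmk_barrier()`, and ordinary shared lists stand in for distributed shared memory. That substitution, the function name `parallel_jacobi`, and the fixed iteration count are all assumptions for illustration; this is not the TreadMarks API.

```python
import threading


def parallel_jacobi(grid, iters, workers=2):
    """Run `iters` Jacobi sweeps, splitting the interior rows among threads."""
    rows, cols = len(grid), len(grid[0])
    grid = [row[:] for row in grid]        # shared "current" grid
    scratch = [row[:] for row in grid]     # shared scratch grid
    barrier = threading.Barrier(workers)   # stand-in for Tmk_barrier()

    def worker(rank):
        interior = rows - 2
        lo = 1 + rank * interior // workers        # first row of my band
        hi = 1 + (rank + 1) * interior // workers  # one past my last row
        for _ in range(iters):
            for i in range(lo, hi):                # compute my portion
                for j in range(1, cols - 1):
                    scratch[i][j] = (grid[i - 1][j] + grid[i + 1][j] +
                                     grid[i][j - 1] + grid[i][j + 1]) / 4.0
            barrier.wait()   # nobody overwrites grid while others still read it
            for i in range(lo, hi):                # get new grid
                grid[i][1:cols - 1] = scratch[i][1:cols - 1]
            barrier.wait()   # everyone has fully copied before the next sweep

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return grid


start = [[100.0] * 6] + [[0.0] * 6 for _ in range(5)]
print(parallel_jacobi(start, iters=50, workers=2)[1][1:5])
```

Dropping either barrier introduces a race: without the first, a worker may overwrite a boundary row its neighbor is still reading; without the second, a worker may start the next sweep on a half-copied grid.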

19
TreadMarks view of memory

At each processor
Some portion of physical memory that is mapped to
global shared memory
Local memory (including cache)
Kernel (OS memory)

20
TreadMarks compared to IVY

IVY is a DSM system that uses sequential
consistency and virtual memory on each
workstation
Memory is stored in pages
Invalidations are sent out before writes to
shared memory
The next time this data item is accessed in IVY, a page fault occurs

21
TreadMarks compared to IVY

For example, say processor 1 gets a page fault
when it tries to access a page in its global
virtual memory (the page is not there)
This causes an interrupt to the OS
A network message is sent to processor 2's global shared memory to get the page
The mechanism then copies the page to some cache location in processor 1's local physical resources

22
Other problems with IVY

False sharing
Since memory is shared in units of a page, more
than one process may write to the same page (but
not to the same location)
There is a lot of overhead and communication for
each DSM access
Context switch to OS kernel
Network messages
Interrupt processing when new page arrives

23
TreadMarks

To reduce communication and overhead
Use lazy release consistency
Only communicate the data when it is requested
Operate in user space
Avoid overhead for context switches
Adds responsibility to the programmer, since the programmer must be aware of the use of shared memory
All synchronization must be done with the TreadMarks primitives

24
TreadMarks

To reduce the cost of false sharing, use a multiple-writer protocol
Most systems use a single-writer protocol
In that protocol, the writer owns the page and no one else can write to the page
Other writers block waiting for access (causing delay in the application that may not have to be there)
With a multiple-writer protocol, you wait to communicate updates until synchronization occurs (consistency traffic is deferred)
Lowers the communication costs

25
Multiple Writer Protocol

Idea
When you read you acquire a copy
When you write, you first make a twin (an unmodified copy of the page) in system space, then you write to the page itself.

26
Multiple Writer Protocol

Suppose another process makes a request. Then:
- compare the twin with the other copy of the page and make a diff (a record of where the two differ)
- the diffs are then sent
- at the same time, discard the twin (since the diff already records the changes).
Since the diff is smaller than the whole page,
the amount of communication is smaller.
Caveat is you have to use appropriate TreadMarks
synchronization tools to ensure program
correctness.
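
The twin-and-diff idea can be sketched on byte arrays. This is an illustration only, under several assumptions: the 16-byte page, the run-length diff format, and the names `make_diff` and `apply_diff` are made up here and are not TreadMarks internals. The sketch treats the twin as the untouched snapshot and the working page as the modified copy; the diff comes out the same whichever of the two copies receives the writes.

```python
PAGE_SIZE = 16  # tiny page, for illustration only


def make_diff(twin: bytes, page: bytes):
    """Return (offset, new_bytes) runs where page differs from twin."""
    diff, i = [], 0
    while i < len(page):
        if page[i] != twin[i]:
            start = i
            while i < len(page) and page[i] != twin[i]:
                i += 1
            diff.append((start, bytes(page[start:i])))
        else:
            i += 1
    return diff


def apply_diff(page: bytearray, diff) -> None:
    """Patch a page in place with the runs recorded in a diff."""
    for offset, data in diff:
        page[offset:offset + len(data)] = data


base = bytes(PAGE_SIZE)

# Writer A and writer B modify different locations of the same page.
page_a = bytearray(base); twin_a = bytes(page_a)   # twin made at first write
page_a[0:2] = b"AA"
page_b = bytearray(base); twin_b = bytes(page_b)
page_b[8:11] = b"BBB"

diff_a = make_diff(twin_a, page_a)
diff_b = make_diff(twin_b, page_b)
print(diff_a)   # [(0, b'AA')]

# A third process merges both diffs instead of fetching two whole pages.
merged = bytearray(base)
apply_diff(merged, diff_a)
apply_diff(merged, diff_b)
print(bytes(merged[0:2]), bytes(merged[8:11]))
```

Because the two writers touched disjoint bytes, their diffs merge cleanly, which is how a multiple-writer protocol sidesteps false sharing. If both had written the same byte without synchronization, the merge result would depend on the order in which the diffs are applied, which is the caveat about using the synchronization primitives correctly.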