Title: Synchronization of Transactions
1. Synchronization of Transactions
- Matt Evett
- Computer Science Department
- Eastern Michigan University
2. Synchronizing Remote Processes
- So far, we've examined synchronization of
processes (and threads) that have shared memory.
- Synchronizing processes running on different
machines, each with its own memory, is more
difficult.
- Consider two machines, each conducting
transactions on a database.
3. Atomic Transactions
- Transaction: a program unit that must be executed
atomically; that is, either all the operations
associated with it are executed to completion, or
none are performed.
- Must preserve atomicity despite the possibility of
failure.
- We are concerned here with ensuring transaction
atomicity in an environment where failures result
in the loss of information on volatile storage.
4. Atomic Transactions (2)
- The terminology is from the database community,
where an example transaction would be a money
transfer between two accounts.
- The concurrent execution of two atomic
transactions is equivalent to their execution in
some unknown serial order.
- Where possible, we'd like to allow the concurrent
execution of transactions. In other words, we
want to ensure the atomicity of the transactions,
yet allow concurrency where possible.
5. Storage in Distributed Systems
- Volatile storage
  - Main and cache memory.
  - Data here is usually lost in a crash.
- Nonvolatile storage
  - Hard drives, tapes.
  - Data is rarely lost during a crash.
- Stable storage
  - Data is never lost.
- Access gets slower going down this list.
6. Log-Based Recovery
- Write-ahead log: all updates are recorded on the
log, which is kept in stable storage. Each update
record has the following fields:
  - transaction name
  - data item name, old value, new value
- The log has a record <Ti starts>, and either
  - <Ti commits> if the transaction commits, or
  - <Ti aborts> if the transaction aborts.
7. Log-Based Recovery (2)
- Recovery algorithm uses two idempotent procedures
(their multiple execution yields the same result as a
single execution):
  - undo(Ti) restores the value of all data updated by
  transaction Ti to the old values. It is invoked
  if the log contains the record <Ti starts>, but not
  <Ti commits>.
  - redo(Ti) sets the value of all data updated by
  transaction Ti to the new values. It is invoked
  if the log contains both <Ti starts> and
  <Ti commits>.
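The undo/redo rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is a plain list of tuples, the database is a dict, transactions ran one at a time, and the names `LOG`, `undo`, `redo`, and `recover` are hypothetical rather than taken from the slides.

```python
# Assumed record shapes (hypothetical):
#   ("starts", txn), ("update", txn, item, old, new), ("commits", txn)
LOG = [
    ("starts", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commits", "T0"),
    ("starts", "T1"),
    ("update", "T1", "C", 700, 600),
    # crash here: T1 never committed
]

def undo(log, db, txn):
    """Restore the old value of every item txn updated (newest update first)."""
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] == txn:
            _, _, item, old, _new = rec
            db[item] = old

def redo(log, db, txn):
    """Set every item txn updated to its new value (oldest update first)."""
    for rec in log:
        if rec[0] == "update" and rec[1] == txn:
            _, _, item, _old, new = rec
            db[item] = new

def recover(log, db):
    """Undo transactions with no commit record, redo those that committed."""
    started = [rec[1] for rec in log if rec[0] == "starts"]
    committed = {rec[1] for rec in log if rec[0] == "commits"}
    for txn in reversed(started):
        if txn not in committed:
            undo(log, db, txn)
    for txn in started:
        if txn in committed:
            redo(log, db, txn)
    return db

# A possible post-crash state: A's update was lost, C's was written.
db = {"A": 1000, "B": 2050, "C": 600}
recover(LOG, db)   # db is now {"A": 950, "B": 2050, "C": 700}
recover(LOG, db)   # running recovery again changes nothing (idempotent)
```

Because both procedures write absolute values taken from the log rather than applying increments, repeating them after a crash during recovery is harmless.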
8. Checkpoints to Reduce Recovery Overhead
- Step 1. Output all log records currently residing
in volatile storage onto stable storage.
- Step 2. Output all modified data residing in
volatile storage to stable storage.
- Step 3. Output log record <checkpoint> onto
stable storage.
9. Recovery Routine
- Recovery routine examines the log to determine the
most recent transaction Ti that started executing
before the most recent checkpoint took place.
  - Search the log backward for the first <checkpoint>
  record.
  - Continue backward to the next <Ti starts> record.
- redo and undo operations need to be applied only to
transaction Ti and all transactions Tk that
started executing after transaction Ti.
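The backward search described above might look like the following sketch. It assumes the log is a list of tuples whose first field is the record type (`"starts"`, `"update"`, `"commits"`, or `"checkpoint"`); that layout and the function name `transactions_to_recover` are hypothetical.

```python
def transactions_to_recover(log):
    """Return Ti (the last transaction started before the most recent
    checkpoint) followed by every transaction that started after it."""
    # Step 1: scan backward for the most recent <checkpoint> record.
    i = len(log) - 1
    while i >= 0 and log[i][0] != "checkpoint":
        i -= 1
    # Step 2: keep scanning backward for the nearest <Ti starts> record.
    while i >= 0 and log[i][0] != "starts":
        i -= 1
    # With no checkpoint (or no start before it), examine the whole log.
    first = max(i, 0)
    return [rec[1] for rec in log[first:] if rec[0] == "starts"]

log = [
    ("starts", "T0"),
    ("update", "T0", "A", 1, 2),
    ("commits", "T0"),
    ("starts", "T1"),
    ("update", "T1", "B", 5, 6),
    ("checkpoint",),
    ("commits", "T1"),
    ("starts", "T2"),
    ("update", "T2", "C", 7, 8),
]
print(transactions_to_recover(log))   # ['T1', 'T2']
```

T0 is skipped because it started before T1, so its updates were already forced to stable storage by the checkpoint.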
10. Concurrent Atomic Transactions
- Concurrent execution of transactions must be
equivalent to some serial schedule: transactions
are executed atomically, in some order.
- Example of a serial schedule in which T0 is
followed by T1.
- Each transaction is composed of operations.

T0            T1
read(A)
write(A)
read(B)
write(B)
              read(A)
              write(A)
              read(B)
              write(B)
11. Conflicting Operations
- Operations Oi and Ok of transactions Ti and Tk
conflict if they access the same data item, and
at least one of these operations is a write
operation.

T0            T1
read(A)
write(A)
              read(A)
              write(A)
read(B)
write(B)
              read(B)
              write(B)
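The conflict definition translates directly into a small predicate. The sketch below assumes each operation is represented as a hypothetical `Op` tuple of transaction name, kind (`"read"` or `"write"`), and data item.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T0"
    kind: str   # "read" or "write"
    item: str   # data item, e.g. "A"

def conflicts(a: Op, b: Op) -> bool:
    """Two operations of different transactions conflict if they touch
    the same data item and at least one of them is a write."""
    return (a.txn != b.txn
            and a.item == b.item
            and "write" in (a.kind, b.kind))

print(conflicts(Op("T0", "write", "A"), Op("T1", "read", "A")))   # True
print(conflicts(Op("T0", "read", "A"), Op("T1", "read", "A")))    # False
print(conflicts(Op("T0", "write", "A"), Op("T1", "write", "B")))  # False
```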
12. Swapping Operations
- Nonconflicting operations (of separate
transactions) that are consecutive in some
schedule can be swapped, yielding an equivalent
schedule.
- (Note that the resulting schedule is not serial.)

Before the swap:

T0            T1
read(A)
write(A)
              read(A)
              write(A)
read(B)
write(B)
              read(B)
              write(B)

After swapping T1's write(A) with T0's read(B):

T0            T1
read(A)
write(A)
              read(A)
read(B)
              write(A)
write(B)
              read(B)
              write(B)
13. Conflict Serializable Schedule
- Conflict serializable schedule: a schedule that
can be transformed into a serial schedule by a
series of swaps of nonconflicting operations that
are consecutive in the given schedule.
14. Example
- Our example schedule is conflict serializable.

Original schedule:

T0            T1
read(A)
write(A)
              read(A)
              write(A)
read(B)
write(B)
              read(B)
              write(B)

Equivalent serial schedule:

T0            T1
read(A)
write(A)
read(B)
write(B)
              read(A)
              write(A)
              read(B)
              write(B)

Four swaps transform the first schedule into the second.
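The swap-based definition can be checked mechanically for small schedules. The following sketch does a breadth-first search over schedules reachable by swapping adjacent nonconflicting operations of different transactions. It is a brute-force illustration, not an efficient test, and the tuple layout `(txn, kind, item)` and all function names are assumptions made for the example.

```python
from collections import deque

def conflicts(a, b):
    """Ops are (txn, kind, item) tuples; see the conflict definition."""
    return a[0] != b[0] and a[2] == b[2] and "write" in (a[1], b[1])

def is_serial(schedule):
    """Serial means a transaction never resumes after another has run."""
    seen, last = set(), None
    for txn, _kind, _item in schedule:
        if txn != last:
            if txn in seen:
                return False
            seen.add(txn)
            last = txn
    return True

def swaps_to_serial(schedule):
    """Minimum number of legal swaps to reach a serial schedule,
    or None if the schedule is not conflict serializable."""
    start = tuple(schedule)
    queue, visited = deque([(start, 0)]), {start}
    while queue:
        current, depth = queue.popleft()
        if is_serial(current):
            return depth
        for i in range(len(current) - 1):
            a, b = current[i], current[i + 1]
            if a[0] != b[0] and not conflicts(a, b):
                nxt = current[:i] + (b, a) + current[i + 2:]
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, depth + 1))
    return None

EXAMPLE = [
    ("T0", "read", "A"), ("T0", "write", "A"),
    ("T1", "read", "A"), ("T1", "write", "A"),
    ("T0", "read", "B"), ("T0", "write", "B"),
    ("T1", "read", "B"), ("T1", "write", "B"),
]
print(swaps_to_serial(EXAMPLE))   # 4

NOT_SERIALIZABLE = [
    ("T0", "read", "A"), ("T1", "write", "A"), ("T0", "write", "A"),
]
print(swaps_to_serial(NOT_SERIALIZABLE))   # None
```

The search space is the set of interleavings that preserve each transaction's internal order, so this is practical only for toy schedules like the ones on these slides.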
15. Ensuring Serializability
- Associate a lock with each data item.
- Transactions must follow a locking protocol.
  - Governs how locks are acquired and released.
- A data item can be locked in two modes:
  - Shared: if Ti has obtained a shared-mode lock on
  data item Q, then Ti can only read Q.
  - Exclusive: if Ti has obtained an exclusive-mode
  lock on data item Q, then Ti can both read and
  write Q. (Readers and Writers)
- But this is still not quite enough...
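The two lock modes follow the readers-and-writers pattern: any number of shared holders, or exactly one exclusive holder. A minimal sketch of the grant decision, assuming modes are the strings `"shared"` and `"exclusive"` and that `held_modes` lists the modes other transactions currently hold on the item (both hypothetical conventions):

```python
def compatible(requested, held_modes):
    """Can a lock in mode `requested` be granted on an item on which
    other transactions currently hold locks in `held_modes`?"""
    if requested == "shared":
        # Readers may share with other readers only.
        return all(mode == "shared" for mode in held_modes)
    # An exclusive request needs the item to be completely unlocked.
    return len(held_modes) == 0

print(compatible("shared", ["shared", "shared"]))   # True
print(compatible("exclusive", ["shared"]))          # False
print(compatible("shared", ["exclusive"]))          # False
print(compatible("exclusive", []))                  # True
```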
16. Further Ensuring...
- Two-phase locking protocol
  - Growing phase: a transaction may obtain locks,
  but may not release any lock.
    - Enter growing phase only if holding no locks.
  - Shrinking phase: a transaction may release locks,
  but may not obtain any new locks.
- The two-phase locking protocol ensures conflict
serializability, but does not ensure freedom from
deadlock. (Why not?)
  - Two processes in the growing phase, each waiting
  for the other to release a locked item.
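The two phases, and the deadlock they still permit, can be illustrated with a short sketch. It tracks only which phase a transaction is in (it does not block or coordinate between transactions), and the class `TwoPhaseTransaction` and function `has_deadlock` are hypothetical names introduced for this example.

```python
class TwoPhaseTransaction:
    """Enforces the growing/shrinking rule for one transaction."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item} while shrinking")
        self.held.add(item)

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True

def has_deadlock(waits_for):
    """waits_for maps each transaction to the transaction it is blocked
    on (or None). A cycle in this wait-for relation is a deadlock."""
    for start in waits_for:
        seen, t = set(), start
        while t is not None and t not in seen:
            seen.add(t)
            t = waits_for.get(t)
        if t is not None:   # we came back to a transaction already visited
            return True
    return False

t0 = TwoPhaseTransaction("T0")
t0.lock("A")
t0.lock("B")
t0.unlock("A")        # T0 is now in its shrinking phase
# t0.lock("C") would raise RuntimeError here

# T0 holds A and wants B; T1 holds B and wants A. Both are still growing.
print(has_deadlock({"T0": "T1", "T1": "T0"}))   # True
print(has_deadlock({"T0": "T1", "T1": None}))   # False
```

The deadlock case involves no protocol violation: both transactions are legally in their growing phase, which is why two-phase locking alone cannot rule it out.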
17. Timestamping
- Locking provides an implicit ordering among
transactions.
  - Depends on the run-time order in which transactions
  attempt simultaneous locks on the same item with
  conflicting modes.
- Timestamping is an ordering protocol that
provides an explicit ordering, providing a
serializable order.
18. Timestamping, Defined
- With each transaction Ti in the system, associate
a unique fixed timestamp, denoted by TS(Ti).
- If Ti has been assigned timestamp TS(Ti), and a
new transaction Tk enters the system, then it is
timestamped so that TS(Ti) < TS(Tk).
19. Implementing Timestamps
- Implement by assigning two timestamp values to
each data item Q.
  - W-timestamp(Q) denotes the largest timestamp of
  any transaction that executed write(Q)
  successfully.
  - R-timestamp(Q) denotes the largest timestamp of
  any transaction that executed read(Q)
  successfully.
20. Timestamping Conflicts
- The timestamp-ordering protocol ensures conflict
serializability: conflicting operations are
processed in timestamp order.
- Example, with TS(T2) < TS(T3), shown in the schedule
below.
- There are schedules that are possible under the
two-phase locking protocol but are not possible
under the timestamp protocol, and vice versa.
  - This schedule (which is serializable) is not
  achievable with the locking protocol.

T2            T3
read(B)
              read(B)
              write(B)
read(A)
              read(A)
              write(A)
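The slides define the two per-item timestamps but do not spell out the accept/reject rules. The sketch below uses the usual textbook rules (reject a read that is older than the last write, and reject a write that is older than the last read or write) and replays the T2/T3 schedule through them. The class name `TimestampOrdering` and the convention that timestamps are positive integers are assumptions for illustration.

```python
class TimestampOrdering:
    """Tracks R-timestamp(Q) and W-timestamp(Q) for each data item Q.
    Assumes transaction timestamps are positive integers, so 0 can
    stand for 'never read' or 'never written'."""

    def __init__(self):
        self.r_ts = {}
        self.w_ts = {}

    def read(self, ts, item):
        # The value this transaction should see was already overwritten
        # by a younger transaction: reject (the transaction rolls back).
        if ts < self.w_ts.get(item, 0):
            return False
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        # A younger transaction already read or wrote the item: reject.
        if ts < self.r_ts.get(item, 0) or ts < self.w_ts.get(item, 0):
            return False
        self.w_ts[item] = ts
        return True

T2, T3 = 2, 3   # TS(T2) < TS(T3)
proto = TimestampOrdering()
results = [
    proto.read(T2, "B"),
    proto.read(T3, "B"),
    proto.write(T3, "B"),
    proto.read(T2, "A"),
    proto.read(T3, "A"),
    proto.write(T3, "A"),
]
print(all(results))   # True: every operation in the schedule is accepted

# By contrast, an older transaction reading after a younger write is rejected.
other = TimestampOrdering()
other.write(T3, "A")
print(other.read(T2, "A"))   # False
```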