Title: Distributed Transactions
1 Distributed Transactions
- What is a transaction?
  (A sequence of server operations that must be carried out atomically.)
- ACID properties: what are these?
  (Atomicity, Consistency, Isolation, Durability)
- What is a distributed transaction?
  (It involves objects managed by multiple servers communicating with one another.)
2 Transactions
[Figure: a transaction as a sequence of server operations on shared variables, ending in Commit / Abort, with a permanent record.]
3 Concurrency control
- The goal of concurrency control is to guarantee that when multiple transactions execute concurrently, the net effect is equivalent to executing them in some serial order. This is the essence of the serializability property.
4 Example 1
- T1: starts (20); W(x1) OK; R(x) OK; T1 commits
- T2: starts (30); W(x2) OK; T2 commits
- T3: starts (40); W(x3) OK; R(x); T3 commits
This is serializable. Think of other examples too.
5 Example 2
- T1: starts (20); W(x1) OK; R(x) NO; T1 aborts
- T2: starts (30); W(x2) OK; R(x); T2 commits
- T3: starts (40); W(x3) OK; T3 commits
This is not serializable.
6 Question
- Transaction 1: Raise the Q score of all GRE candidates from Iowa City by 10 points.
- Transaction 2: Raise the Q score of all students whose id ends with 035 by 5 points.
Can we run these concurrently? Explain.
7 Pitfalls in concurrency control
- Dirty read
- Lost update
- Premature write
8 Lost update
- Initially, B = 1000
- Amy's transaction | Bob's transaction
- 1 Load B into local | 4 Load B into local
- 2 Add 250 to local | 5 Add 250 to local
- 3 Store local to B | 6 Store local to B
- What if the interleaving is 1 4 2 5 3 6? The final value of B is 1250, although it should have been 1500: both transactions load the original 1000, so the second store overwrites the first.
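The interleaving above can be replayed with a small simulation. This is only a sketch: the function `run`, the step table, and the names "amy" and "bob" are illustrative assumptions, not part of the original slides.

```python
# Hypothetical replay of the six numbered steps from the slide.
def run(schedule, initial=1000):
    """Execute steps 1..6 in the given order and return the final value of B."""
    B = initial
    local = {"amy": None, "bob": None}   # each transaction's private copy
    steps = {
        1: ("amy", "load"), 2: ("amy", "add"), 3: ("amy", "store"),
        4: ("bob", "load"), 5: ("bob", "add"), 6: ("bob", "store"),
    }
    for n in schedule:
        who, action = steps[n]
        if action == "load":
            local[who] = B
        elif action == "add":
            local[who] += 250
        else:  # store
            B = local[who]
    return B

serial = run([1, 2, 3, 4, 5, 6])        # one transaction after the other
interleaved = run([1, 4, 2, 5, 3, 6])   # the interleaving from the slide
print(serial, interleaved)              # prints "1500 1250"
```

In the interleaved run both transactions load 1000 before either stores, so one of the two additions is lost.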
9 Dirty read
- Initially, B = 1000
- Amy's transaction | Bob's transaction
- 1 Load B into local | 4 Load B into local
- 2 Add 250 to local | 5 Add 250 to local
- 3 Store local to B | 6 Store local to B
- ABORT | COMMIT
- Execute the actions in the sequence 1 2 3 4 5 6. The final result is still 1500, although it should have been 1250, because Bob's transaction read the value written by Amy's transaction before Amy's transaction aborted.
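The sequence 1 2 3 4 5 6 can be traced in a few lines. This is a minimal sketch that assumes uncommitted writes are immediately visible to other transactions (no isolation); the function names are hypothetical.

```python
def dirty_read_demo(initial=1000):
    """Trace steps 1..6 with no isolation; return the value Bob commits."""
    B = initial

    # Amy: steps 1-3
    amy_local = B
    amy_local += 250
    B = amy_local          # uncommitted write: B is now 1250

    # Bob: steps 4-6
    bob_local = B          # dirty read of Amy's uncommitted value
    bob_local += 250
    B = bob_local          # B is now 1500

    # Amy aborts and Bob commits; Bob's result already includes Amy's +250.
    return B

def expected_result(initial=1000):
    """Amy aborted, so only Bob's +250 should take effect."""
    return initial + 250

print(dirty_read_demo(), expected_result())   # prints "1500 1250"
```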
10 Premature write
- Initially, B = 0
- Amy's transaction | Bob's transaction
- 1 B = 500 | 2 B = 1000
- | 3 COMMIT
- 4 ABORT |
- When Amy's transaction aborts, B changes back to 0, and Bob's committed write is lost. This could have been avoided if the second transaction had postponed its commit until the first transaction committed or aborted.
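The effect can be reproduced with a toy store. The sketch below assumes that an abort is implemented by restoring the before-image (the value seen just before the transaction's write); the slides do not state the recovery mechanism, and the `Store` class is hypothetical.

```python
class Store:
    """Toy single-variable store with before-image recovery (an assumption)."""

    def __init__(self, value=0):
        self.B = value
        self.before = {}               # transaction name -> before-image of B

    def write(self, txn, value):
        self.before[txn] = self.B      # remember the value seen before writing
        self.B = value

    def commit(self, txn):
        self.before.pop(txn, None)     # the before-image is no longer needed

    def abort(self, txn):
        self.B = self.before.pop(txn)  # restore the before-image

s = Store(0)
s.write("amy", 500)    # step 1: before-image is 0
s.write("bob", 1000)   # step 2: before-image is 500
s.commit("bob")        # step 3
s.abort("amy")         # step 4: B is restored to 0
final_B = s.B
print(final_B)         # prints "0"
```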
11 Locks
- Locks are commonly used to implement serializability of concurrent transactions. Two operations on a shared object are in conflict when at least one of them is a write operation. Each transaction must acquire the corresponding exclusive lock before executing an action.
- Locks can be fine grained. Note that there is no conflict between two reads.
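The conflict rule can be written as a one-line predicate. This is a sketch: the `(kind, obj)` encoding of an operation is an assumption, and so is the reading that operations conflict only when they touch the same object.

```python
def conflicts(op1, op2):
    """Return True if two operations conflict.

    Each operation is a (kind, obj) pair, with kind "R" for read or "W" for
    write. They conflict when they touch the same object and at least one
    of them is a write.
    """
    (kind1, obj1), (kind2, obj2) = op1, op2
    return obj1 == obj2 and "W" in (kind1, kind2)

print(conflicts(("R", "x"), ("R", "x")))   # prints "False"
print(conflicts(("R", "x"), ("W", "x")))   # prints "True"
```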
12 Serializability
- The serialization graph is a directed graph (V, E), where V is the set of transactions and E is the set of directed edges between transactions. A directed edge from a transaction Tj to a transaction Tk means that Tk acquired a lock only after Tj released the corresponding lock.
[Figure: directed edge Tj → Tk]
13 Serializability theorem
- For a set of concurrent transactions, the serializability property holds if and only if the corresponding serialization graph is acyclic.
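The acyclicity test in the theorem can be sketched as a depth-first search over the serialization graph. The function name `is_serializable` and the edge-list representation are assumptions made for this illustration.

```python
def is_serializable(edges):
    """Return True if the serialization graph given as (Tj, Tk) edges is acyclic."""
    graph = {}
    for tj, tk in edges:
        graph.setdefault(tj, set()).add(tk)
        graph.setdefault(tk, set())

    WHITE, GREY, BLACK = 0, 1, 2       # unvisited, on the DFS stack, finished
    color = {t: WHITE for t in graph}

    def has_cycle(t):
        color[t] = GREY
        for nxt in graph[t]:
            if color[nxt] == GREY:     # back edge: a cycle exists
                return True
            if color[nxt] == WHITE and has_cycle(nxt):
                return True
        color[t] = BLACK
        return False

    return not any(color[t] == WHITE and has_cycle(t) for t in graph)

print(is_serializable([("T1", "T2"), ("T2", "T3")]))                 # prints "True"
print(is_serializable([("T1", "T2"), ("T2", "T3"), ("T3", "T1")]))   # prints "False"
```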
14 Two-phase locking (2PL)
- Phase 1. Acquire all locks needed to execute the transaction. The locks are acquired one after another, and this phase is called the growing phase or acquisition phase.
- Phase 2. Release all locks acquired so far. This is called the shrinking phase or the release phase.
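The two-phase rule can be enforced with a single flag per transaction. The sketch below is a hypothetical illustration: the class `TwoPhaseTransaction` only tracks which locks one transaction holds and does not model contention between transactions.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and rejects any acquire after a release."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False         # becomes True at the first release

    def acquire(self, obj):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        self.held.add(obj)             # growing phase

    def release(self, obj):
        self.shrinking = True          # shrinking phase begins
        self.held.discard(obj)

t = TwoPhaseTransaction("T1")
t.acquire("x")
t.acquire("y")
t.release("x")
try:
    t.acquire("z")                     # not allowed once shrinking has begun
    violated = False
except RuntimeError:
    violated = True
print(violated)                        # prints "True"
```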
15 Two-phase locking (2PL)
[Figure: locks are acquired during the growing phase and released during the shrinking phase.]
16 2PL
- Theorem. 2PL guarantees serializability.
- Proof. Suppose that the theorem is not correct. Then the serialization graph must contain a cycle, say Tj → Tk → Tm → Tj. This implies that Tj must have released a lock (that was later acquired by Tk) and then acquired a lock (that was released by Tm). However, this violates the condition of two-phase locking, which rules out acquiring any lock once a lock has been released.
17 Atomic Commit Protocols
- Network of servers
- The initiator of a transaction is called the coordinator, and the remaining servers are participants.
- Servers may crash.
[Figure: network of servers S1, S2, S3]
18 Requirements of Atomic Commit Protocols
- Termination. All non-faulty servers must eventually reach an irrevocable decision.
- Agreement. If any server decides to commit, then every server must have voted to commit.
- Validity. If all servers vote commit and there is no failure, then all servers must commit.
[Figure: network of servers S1, S2, S3; servers may crash]
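Agreement and validity can be checked on a finished run by comparing votes with decisions. This is a sketch with hypothetical function names and a hypothetical dictionary encoding of votes and decisions. Termination is a liveness property and cannot be checked from a single snapshot, so it is left out.

```python
def check_agreement(votes, decisions):
    """If any server decided commit, every server must have voted commit."""
    if "commit" in decisions.values():
        return all(v == "commit" for v in votes.values())
    return True

def check_validity(votes, decisions, failure_occurred):
    """If all servers voted commit and nothing failed, all must decide commit."""
    if not failure_occurred and all(v == "commit" for v in votes.values()):
        return all(d == "commit" for d in decisions.values())
    return True

votes = {"S1": "commit", "S2": "abort", "S3": "commit"}
decisions = {"S1": "abort", "S2": "abort", "S3": "abort"}
print(check_agreement(votes, decisions))          # prints "True"
print(check_validity(votes, decisions, False))    # prints "True"
```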
19 One-phase Commit
[Figure: a client, a coordinator server, and three participant servers; a message labeled Commit.]
If a participant deadlocks or faces a local problem, then the coordinator may never be able to find out. Too simplistic.
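The weakness can be seen in a toy model. The sketch below assumes the coordinator simply tells every participant to commit and collects no replies; the `Participant` class, its `healthy` flag, and the function `one_phase_commit` are hypothetical.

```python
class Participant:
    """Toy participant; healthy=False stands for a deadlock or local problem."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.state = "active"

    def on_commit(self):
        # One-phase commit gives the participant no chance to vote or object.
        if self.healthy:
            self.state = "committed"
        # An unhealthy participant stays "active" and nobody is told.

def one_phase_commit(participants):
    """Coordinator side: send Commit to everyone and decide unilaterally."""
    for p in participants:
        p.on_commit()                  # no reply is collected
    return "commit"

servers = [Participant("S1"), Participant("S2"), Participant("S3", healthy=False)]
decision = one_phase_commit(servers)
states = [p.state for p in servers]
print(decision, states)   # prints "commit ['committed', 'committed', 'active']"
```

The coordinator reports commit even though one participant never committed.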