Title: Distributed Transaction Management
1Distributed Transaction Management
- Transactions and Data Synchronization
- Flat and Nested Distributed Transactions
- Atomic Commit Protocol
- Concurrency Control for Distributed Transactions
- Distributed Deadlock Avoidance and Detection
2Process/Data Synchronization
- Consider a client/server database system
- To improve system performance, the server may execute multiple processes (threads) concurrently on behalf of different clients
- Each process may access a set of data objects maintained in a database
- Some data objects are accessed (shared) by more than one process concurrently (i.e., a second process accesses the object before the process that last accessed it has ended)
- A process may update the value of a shared data object and so affect the execution result of another process that is accessing the same data object
- The synchronization of accesses to shared data objects from different processes is called data synchronization
- The purposes of data synchronization are to ensure
- The correctness of data objects, which are persistent objects
- What are persistent objects? Data items stored in a database, e.g., your name and your address
- The correctness of the results returned from process execution
3Process/Data Synchronization
4Process/Data Synchronization
- How to achieve the objectives
- By prevention or avoidance (controlling the interleaving of process execution)
- What is the difference between prevention and avoidance?
- Two processes cannot access the same data object at the same time
- When one process invokes a method (operation) to access a data object, the data object is locked (mutual exclusion)
- We cannot look up an account and withdraw from the same account at the same time
- What are the methods/approaches for achieving mutual exclusion?
- Operations that are free from interference from concurrent operations (processes) and have to be done in a single step are called atomic operations (they cannot be stopped in the middle)
- Cost: the concurrency in process execution is lowered
- Poor performance (longer response time)
- Longer waiting time for accessing a shared object
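The account example above can be sketched in Java. This is a minimal illustration, not part of the original slides: the class name SharedAccount and its methods are hypothetical, and Java's synchronized keyword stands in for the lock on the data object.

```java
// Hypothetical account object; `synchronized` gives mutual exclusion per object,
// so a lookup can never interleave with a withdrawal on the same account.
final class SharedAccount {
    private long balance;

    SharedAccount(long openingBalance) {
        this.balance = openingBalance;
    }

    // Atomic with respect to every other synchronized method of this object.
    synchronized long lookup() {
        return balance;
    }

    // The check and the update happen as one step; no other thread can
    // observe or change the balance in between.
    synchronized boolean withdraw(long amount) {
        if (amount <= 0 || amount > balance) {
            return false;
        }
        balance -= amount;
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedAccount a = new SharedAccount(100);
        Thread t1 = new Thread(() -> a.withdraw(60));
        Thread t2 = new Thread(() -> a.withdraw(60));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Exactly one of the two withdrawals can succeed, so 40 remains.
        System.out.println(a.lookup());
    }
}
```

The cost mentioned on the slide is visible here: while one thread is inside withdraw, every other thread that wants the same account has to wait.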
5What is a transaction?
- From a process to a transaction
- A process is normally generated for a specific purpose, and each process may access multiple data items
- Specific purpose: e.g., buying a ticket or having a dinner. It corresponds to an event in the real world and is not just a simulated event in the computer
- Definition of a transaction from the user viewpoint
- The execution of a program to perform a function (or functions) by accessing shared data items (a database), usually on behalf of a user (application)
- A transaction
- Is a process (or a set of concurrent processes)
- Each process consists of multiple atomic steps
- Each step is called an operation
- Two types of operations
- Database operations: a collection of operations, usually read and write, on the database, together with some computation
- Transaction operations: a begin operation and an end operation
- ACID requirements (atomicity, consistency, isolation and durability)
6Distributed Transactions
- A transaction becomes distributed if it invokes operations at different servers
- Requirements in processing a distributed transaction
- Maintain data consistency (ensure the correctness of results and of the values of data items in a database)
- Use a distributed concurrency control protocol
- Maintain atomicity of distributed transactions
- The different processes of the same transaction end up with the same termination decision (commit or abort) (much more difficult than in a centralized system, why?)
- Use an atomic commit protocol (all commit or all abort)
- An atomic commit protocol must have a failure model to ensure transaction atomicity and durability even when different types of failures (process and network failures) occur
- Consider using a new transaction model (other than the flat model) to minimize the cost of transaction abort
- What is a transaction model? What is a flat transaction model?
- Nested transaction model
7Transaction structure and database consistency
[Figure: timeline of the execution of a transaction. The database is in a consistent state at Begin Transaction and again at End Transaction; it may be temporarily in an inconsistent state during execution.]
8Operations in Coordinator interface
openTransaction() -> trans
starts a new transaction and delivers a unique TID trans. This identifier will be used in the other operations in the transaction.
closeTransaction(trans) -> (commit, abort)
ends a transaction: a commit return value indicates that the transaction has committed; an abort return value indicates that it has aborted.
abortTransaction(trans)
aborts the transaction.
The abort cost could be heavy if the transaction is long
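A minimal in-memory sketch of the three coordinator operations is given below. It is an illustration only: the class name SimpleCoordinator, the use of long values as TIDs and the Outcome enum are assumptions, and no data objects, participants or logging are modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-server coordinator that only tracks transaction status.
final class SimpleCoordinator {
    enum Outcome { COMMIT, ABORT }

    private enum State { ACTIVE, COMMITTED, ABORTED }

    private final Map<Long, State> transactions = new HashMap<>();
    private long nextTid = 1;

    // openTransaction() -> trans: start a transaction and hand out a unique TID.
    synchronized long openTransaction() {
        long tid = nextTid++;
        transactions.put(tid, State.ACTIVE);
        return tid;
    }

    // closeTransaction(trans) -> (commit, abort): report how the transaction ended.
    synchronized Outcome closeTransaction(long tid) {
        State s = transactions.get(tid);
        if (s == State.ACTIVE) {
            transactions.put(tid, State.COMMITTED);
            return Outcome.COMMIT;
        }
        if (s == State.COMMITTED) {
            return Outcome.COMMIT;
        }
        return Outcome.ABORT; // unknown TID or already aborted
    }

    // abortTransaction(trans): abort a transaction that is still active.
    synchronized void abortTransaction(long tid) {
        if (transactions.get(tid) == State.ACTIVE) {
            transactions.put(tid, State.ABORTED);
        }
    }
}
```

A client would call openTransaction once, pass the returned TID with each operation, and finish with closeTransaction or abortTransaction.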
9Transaction life histories
Successful          | Aborted by client   | Aborted by server
openTransaction     | openTransaction     | openTransaction
operation           | operation           | operation
operation           | operation           | operation
                    |                     | server aborts transaction
operation           | operation           | operation ERROR reported to client
closeTransaction    | abortTransaction    |
10Distributed Transactions
- Distributed transactions
- Multiple processes need to be created for a transaction at different servers to access the distributed data objects maintained by the servers
- The multiple processes of a transaction are coordinated by a coordinator (distributed transaction model)
- A client starts a transaction by sending an openTransaction request to a server that manages the required data objects
- The coordinator returns the TID to the client and is responsible for committing or aborting the transaction at its end
- The process responsible for accessing the required data objects is called a participant. It is responsible for keeping track of all the recoverable objects (those that can be restored to their original values) for a transaction
- Problems in the execution of distributed transactions
- Long execution time
- Highly affected by the performance of the network (retransmission)
- The impact of message loss and process failure can be very serious
11Distributed Transaction Model
[Figure: example of the distributed transaction model. A master process coordinates Cohort 1 at Site 1, Cohort 2 at Site 2 and Cohort 3 at Site 3.]
12Distributed Transactions
- Flat vs. nested distributed transactions
- Flat distributed transaction
- No sub-transactions -> a single thread of control
- One coordinator
- Multiple participants
- Sequential execution of participants
- Nested distributed transactions
- Sub-transactions and nested sub-transactions
- Each sub-transaction starts after its parent and finishes before it
- One sub-transaction -> one coordinator for coordinating its participants
- A participant can be a coordinator if it has sub-transactions
- Different coordinators could reach different commit/abort decisions
- Parallel execution of sub-transactions is possible
- Atomicity may be applied only at the sub-transaction level
13Distributed Transactions
(a) Flat transaction
(b) Nested transactions
X, Y, Z are servers connected by a network
14Nested Banking Transaction (Skip)
A client transfers 10 from account A to C, and
then transfers 20 from B to D. A and B are at
separate servers X and Y, and C and D are at
server Z
15A Distributed Banking Transaction
Client transaction T transfers 4 from account A
to account C, and then transfers 3 from account
B to D
16Two-phase Commit Protocol
- Ensures the atomicity and durability properties of distributed transactions
- All the processes of a transaction/sub-transaction have to reach the same decision (commit/abort)
- A process cannot reverse its decision after it has reached one
- The (global) commit decision can be reached only if all processes voted yes
- A yes vote means that the process is willing to commit
- Once a transaction has committed, all its effects become permanent even if failures occur
- The commit/abort decision is made by the coordinator of a transaction (sub-transaction)
- The coordinator collects votes from its participants through message exchanges
17Operations for Two-Phase Commit
canCommit?(trans) -> Yes / No
Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote.
doCommit(trans)
Call from coordinator to participant to tell participant to commit its part of a transaction.
doAbort(trans)
Call from coordinator to participant to tell participant to abort its part of a transaction.
haveCommitted(trans, participant)
Call from participant to coordinator to confirm that it has committed the transaction.
getDecision(trans) -> Yes / No
Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages.
18 2PC Steps
- Phase 1 (voting phase): collect the decision of each individual process (participant) of a distributed transaction to commit or abort
- Phase 2 (decision phase): make the final global commit or abort decision and ensure that everybody writes the results into the database
- Global Commit Rule
- Abort a transaction if and only if at least one participant votes to abort
- Commit a transaction if and only if all of the participants vote to commit
19The Two-phase Commit Protocol
Phase 1 (voting phase)
1. The coordinator sends a canCommit? request to each of the participants in the transaction
2. When a participant receives a canCommit? request it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No the participant aborts immediately
Phase 2 (completion according to outcome of vote)
3. The coordinator collects the votes (including its own)
(a) If there are no failures and all the votes are Yes the coordinator decides to commit the transaction and sends a doCommit request to each of the participants
(b) Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes
4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages, it acts accordingly and in the case of commit, makes a haveCommitted call as confirmation to the coordinator
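The four steps above can be sketched as a single-process simulation. This is a simplified illustration under stated assumptions: the names TwoPhaseCommitSketch, Participant and InMemoryParticipant are hypothetical, messages are plain method calls, the coordinator's own vote is taken to be Yes, and failures, time-outs, logging and the haveCommitted confirmation are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical failure-free simulation of the two phases of 2PC.
final class TwoPhaseCommitSketch {
    interface Participant {
        boolean canCommit(long tid); // phase 1: reply with a vote (true = Yes)
        void doCommit(long tid);     // phase 2: commit this participant's part
        void doAbort(long tid);      // phase 2: abort this participant's part
    }

    // Toy participant that only records the state it was driven into.
    static final class InMemoryParticipant implements Participant {
        private final boolean willVoteYes;
        String state = "ACTIVE";

        InMemoryParticipant(boolean willVoteYes) {
            this.willVoteYes = willVoteYes;
        }

        public boolean canCommit(long tid) {
            // A real participant saves its objects in permanent storage before voting Yes;
            // a No voter aborts immediately.
            state = willVoteYes ? "PREPARED" : "ABORTED";
            return willVoteYes;
        }

        public void doCommit(long tid) { state = "COMMITTED"; }

        public void doAbort(long tid) { state = "ABORTED"; }
    }

    // Runs both phases and returns true if the transaction committed.
    static boolean run(long tid, List<? extends Participant> participants) {
        List<Participant> votedYes = new ArrayList<>();
        boolean allYes = true;
        for (Participant p : participants) {      // phase 1: ask every participant
            if (p.canCommit(tid)) {
                votedYes.add(p);
            } else {
                allYes = false;
            }
        }
        for (Participant p : votedYes) {          // phase 2: only Yes voters are still waiting
            if (allYes) {
                p.doCommit(tid);
            } else {
                p.doAbort(tid);
            }
        }
        return allYes;
    }
}
```

Note that the decision is sent only to participants that voted Yes, since a participant that voted No has already aborted.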
20Communication in Two-phase Commit
21Uncertainty Period in 2PC
- A participant is in its uncertainty period after it sends a Yes vote to the coordinator
- It has to wait for the final decision from the coordinator and cannot decide to abort
- The period ends when it receives a commit or abort message
- What should a participant do if, after waiting for a long time, it still has not received any decision from its coordinator? Time out and then decide to abort?
- The coordinator has no uncertainty period since it decides as soon as it votes
22Failure Handling in 2PC
- Two-phase commit is resilient to all types of failures in which no log information is lost
- It deals with failures by writing log information into stable storage before any decision is made
- begin_commit, commit, end_of_tran (coordinator)
- ready log (abort log), commit log (participants)
- 2PC uses time-outs to resolve failures (making an assumption about the longest time needed to receive a reply)
- A time-out occurs when a process cannot get an expected message from another process within the expected time period
- Failure types
- Site failures: a participant fails or the coordinator fails
- Lost messages, e.g.,
- an answer from a participant is lost
- a vote request (canCommit?) is lost
- the final decision is lost
23Failure Handling in 2PC
- Participant fails before writing the vote into the log
- The coordinator's time-out expires and it decides to abort
- Participant fails after having written the vote into the log
- When the participant recovers, it aborts if its vote is No
- If its vote is Yes, it tries to find out the final decision from the coordinator or other participants (the log must be written before the vote is sent)
- Coordinator fails after having written the vote request but before having written the final decision
- All participants that have already answered Yes must wait for the recovery of the coordinator for the final decision
- When the coordinator recovers, it sends the vote requests again
24Failure Handling in 2PC
- Coordinator fails after having written the final decision but before having written the complete record
- When the coordinator recovers, it sends the decision to all the participants
- The answer of a participant is lost
- The coordinator times out and decides to abort
- The vote request message is lost
- The coordinator times out and decides to abort
- The final decision is lost
- The participants that have voted Yes have to wait. After the time-out interval, each must ask the coordinator for the final decision
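The participant-side recovery cases can be summarized as a decision on the last log record found after a restart. The sketch below is an assumption-laden illustration: the names ParticipantRecovery, LogRecord and Action are hypothetical, and the COMMIT case (the decision had already been logged, so the commit is simply completed) is an inference that the slides do not spell out.

```java
// Hypothetical mapping from the last participant log record to a recovery action.
final class ParticipantRecovery {
    enum LogRecord { NONE, READY, ABORT, COMMIT }

    enum Action { ABORT_LOCALLY, ASK_COORDINATOR, COMPLETE_COMMIT }

    static Action onRecovery(LogRecord last) {
        switch (last) {
            case COMMIT:
                // Assumption: the global decision was logged before the crash.
                return Action.COMPLETE_COMMIT;
            case READY:
                // Voted Yes: the participant is uncertain and must obtain the decision
                // from the coordinator (or another participant).
                return Action.ASK_COORDINATOR;
            default:
                // NONE: no vote was logged, so the coordinator will time out and abort.
                // ABORT: the participant voted No.
                return Action.ABORT_LOCALLY;
        }
    }
}
```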
25Performance of 2PC
- Measures: time duration and number of communication messages
- The performance of 2PC depends very much on the system architecture
- Centralized
- Hierarchical
- Linear
- Centralized (for N participants)
- Total number of messages: 4(N-1)
- Time delay: 4T, where T is the time for one message
- Hierarchical
- Depends on the configuration
- The number of messages may be smaller and the time may be longer
- Linear
- The number of messages will be the smallest but the time required is the longest
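The trade-off between the centralized and linear arrangements can be made concrete with a little arithmetic. The centralized figures, 4(N-1) messages and 4T delay, are the ones given on the slide. The linear figures are an assumption made for this sketch: the N sites form a chain, one vote message travels forward over each of the N-1 links and one decision message travels back, strictly one after another. The class and method names are hypothetical.

```java
// Hypothetical cost model for 2PC; n is the number of sites, t the time for one message.
final class TwoPcCost {
    // Centralized: canCommit?, vote, decision and confirmation with each of the other n-1 sites.
    static int centralizedMessages(int n) {
        return 4 * (n - 1);
    }

    // Centralized: the four message rounds run in parallel across sites.
    static int centralizedDelay(int t) {
        return 4 * t;
    }

    // Linear (assumed model): one message forward and one back on each of the n-1 links.
    static int linearMessages(int n) {
        return 2 * (n - 1);
    }

    // Linear (assumed model): the 2(n-1) messages are strictly sequential.
    static int linearDelay(int n, int t) {
        return 2 * (n - 1) * t;
    }
}
```

For example, with n = 5 and t = 10 time units the centralized scheme uses 16 messages and 40 time units, while the assumed linear scheme uses 8 messages and 80 time units.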
26Centralized 2PC
27Linear 2PC
28Distributed 2PC
29 2PC for Nested Transactions (Skip)
- When a sub-transaction completes, it makes an independent decision either to commit provisionally or to abort. If a parent aborts, all its sub-transactions are forced to abort
- When a sub-transaction provisionally commits, it reports its status and the status of its descendants to its parent (coordinator)
- If a nested sub-transaction aborts, it just reports abort to its parent without giving any information about its descendants
- The top-level transaction has a list of all the sub-transactions in the tree together with their commit status
- Descendants of aborted sub-transactions are omitted from the list
30 2PC for Nested Transactions (Skip)
- The parent may commit even if some of its children have decided to abort
- After all its sub-transactions have completed, the provisionally committed sub-transactions participate in a 2PC
- Note: a provisional commit is not backed up in stable storage. In case of a crash, it cannot be recovered. A provisional commit indicates only that a sub-transaction has completed successfully and will probably agree to commit when it is subsequently asked to
- Normally, the only reason for a participant sub-transaction being unable to commit is that it has crashed since it completed its provisional commit
31 2PC for Nested Transactions (Skip)
- How to identify the set of sub-transactions for commit in the hierarchy?
- a hierarchical approach
- a linear approach
- A sub-transaction is an orphan if one of its ancestors aborts. It will not take part in the commit decision and will eventually be aborted
- A client starts a set of nested transactions by opening a top-level transaction with an openTransaction operation, and a TID is assigned
- A client starts a sub-transaction by invoking the openSubTransaction operation, whose argument specifies its parent transaction
- The new sub-transaction joins the parent transaction and a TID for it is returned
32Operations in Coordinator for Nested
Transactions (Skip)
openSubTransaction(trans) -> subTrans
Opens a new subtransaction whose parent is trans and returns a unique subtransaction identifier.
getStatus(trans) -> committed, aborted, provisional
Asks the coordinator to report on the status of the transaction trans. Returns values representing one of the following: committed, aborted, provisional.
33Transaction T decides whether to commit (Skip)
34Information held by coordinators of nested
transactions (Skip)
Coordinator of transaction | Child transactions | Participant         | Provisional commit list | Abort list
T                          | T1, T2             | yes                 | T1, T12                 | T11, T2
T1                         | T11, T12           | yes                 | T1, T12                 | T11
T2                         | T21, T22           | no (aborted)        |                         | T2
T11                        |                    | no (aborted)        |                         | T11
T12, T21                   |                    | T12 but not T21     | T21, T12                |
T22                        |                    | no (parent aborted) | T22                     |
35Hierarchical 2PC for Nested Transactions (Skip)
- The coordinator of the top-level transaction sends canCommit? to the coordinators of the sub-transactions for which it is the immediate parent, and so on level by level
- canCommit?(trans, subTrans) -> Yes / No
- Call from a coordinator to the coordinator of a child subtransaction to ask whether it can commit the subtransaction subTrans. The first argument, trans, is the transaction identifier of the top-level transaction. The participant replies with its vote, Yes / No
- Each participant collects the replies from its descendants (sub-transactions) before replying to its parent (coordinator)
36Flat 2PC for Nested Transactions (Skip)
- The coordinator of the top-level transaction sends canCommit? to the coordinators of all the sub-transactions at all levels
- The list of aborted sub-transactions is included in the message to eliminate them from the commit procedure
- canCommit?(trans, abortList) -> Yes / No
- Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote Yes / No
- When a participant (coordinator) receives a canCommit? request
- If the participant has any provisionally committed transactions that are descendants of the top-level transaction, trans
- Check that they do not have aborted ancestors in the abortList. Then prepare to commit
- Those with aborted ancestors are aborted
- Otherwise, send a Yes reply to the coordinator
37Concurrency Control using Locking
- Simple locks (atomic operations) cannot resolve the data synchronization problem for transactions (the schedule could be non-serializable)
- Strict execution: delay the reading and updating of a data object until the previous transaction that updated the same data object has committed or aborted
- An execution is recoverable if all the effects of an aborted transaction can be removed (all-or-none property)
- To ensure recoverability
- If a transaction has read uncommitted data, do not allow it to commit before the transaction that wrote the data has committed
38 Recoverability Example
An unrecoverable schedule due to a dirty read
Transaction T: BankDeposit(A, 3)   | Transaction U: BankDeposit(A, 5)
balance = A.Read()        (100)    |
A.Write(balance + 3)      (103)    |
                                   | balance = A.Read()        (103)
                                   | A.Write(balance + 5)      (108)
                                   | Commit transaction
Abort transaction                  |
39Concurrency Control using Locking
- Methods have to be designed to work with locking
- e.g., two-phase locking
- Lock a data object before accessing (reading/writing) it (growing phase)
- Once a transaction releases a lock, it cannot submit any further lock request (shrinking phase)
- e.g., locks are released just before the commit of a transaction
- No sharing of uncommitted data objects in conflicting modes among concurrently executing transactions
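The two-phase rule itself is easy to enforce per transaction: remember whether any lock has been released yet and reject lock requests after that point. The sketch below shows only this bookkeeping; the class TwoPhaseLockingTxn and its methods are hypothetical, objects are identified by plain strings, and conflicts with other transactions are not modelled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-transaction bookkeeping for the growing and shrinking phases.
final class TwoPhaseLockingTxn {
    private final Set<String> held = new HashSet<>();
    private boolean shrinking = false; // becomes true after the first release

    // Growing phase: locks may be acquired only while nothing has been released.
    void lock(String objectId) {
        if (shrinking) {
            throw new IllegalStateException("2PL violated: lock requested after a release");
        }
        held.add(objectId);
    }

    // The first successful release moves the transaction into its shrinking phase.
    void unlock(String objectId) {
        if (held.remove(objectId)) {
            shrinking = true;
        }
    }

    // Releasing everything at the end of the transaction is the usual way to shrink.
    void releaseAllAtCommit() {
        held.clear();
        shrinking = true;
    }

    boolean holds(String objectId) {
        return held.contains(objectId);
    }
}
```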
40Transactions T and U with Exclusive Locks
Transaction T:                | Transaction U:
balance = b.getBalance()      | balance = b.getBalance()
b.setBalance(bal*1.1)         | b.setBalance(bal*1.1)
a.withdraw(bal/10)            | c.withdraw(bal/10)

T: Operations           | T: Locks     | U: Operations           | U: Locks
openTransaction         |              |                         |
bal = b.getBalance()    | lock B       |                         |
b.setBalance(bal*1.1)   |              | openTransaction         |
a.withdraw(bal/10)      | lock A       | bal = b.getBalance()    | waits for T's lock on B
closeTransaction        | unlock A, B  |                         |
                        |              |                         | lock B
                        |              | b.setBalance(bal*1.1)   |
                        |              | c.withdraw(bal/10)      | lock C
                        |              | closeTransaction        | unlock B, C
41Lock Compatibility
For one object          | Lock requested: read | Lock requested: write
Lock already set: none  | OK                   | OK
Lock already set: read  | OK                   | wait
Lock already set: write | wait                 | wait
42Use of locks in Strict Two-phase Locking
1. When an operation accesses an object within a transaction
(a) If the object is not already locked, it is locked and the operation proceeds.
(b) If the object has a conflicting lock set by another transaction, the transaction must wait until it is unlocked.
(c) If the object has a non-conflicting lock set by another transaction, the lock is shared and the operation proceeds.
(d) If the object has already been locked in the same transaction, the lock will be promoted if necessary and the operation proceeds. (Where promotion is prevented by a conflicting lock, rule (b) is used.)
2. When a transaction is committed or aborted, the server unlocks all objects it locked for the transaction
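Rules (a) to (d) and rule 2 can be expressed as a non-blocking decision on a single lock. This is a sketch for illustration, not the Lock class of the slides: the name StrictLockRules, the Result values and the use of strings as TIDs are assumptions, and instead of blocking the caller it returns MUST_WAIT where rule (b) applies.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical single-object lock that reports which rule applied to a request.
final class StrictLockRules {
    enum Mode { READ, WRITE }

    enum Result { GRANTED, SHARED, PROMOTED, ALREADY_HELD, MUST_WAIT }

    private final Set<String> holders = new HashSet<>();
    private Mode mode = null; // null while the object is unlocked

    Result request(String tid, Mode wanted) {
        if (holders.isEmpty()) {                          // rule (a): not locked yet
            holders.add(tid);
            mode = wanted;
            return Result.GRANTED;
        }
        if (holders.contains(tid)) {                      // rule (d): same transaction
            if (mode == Mode.WRITE || wanted == Mode.READ) {
                return Result.ALREADY_HELD;               // current lock is strong enough
            }
            if (holders.size() > 1) {
                return Result.MUST_WAIT;                  // promotion blocked by other readers: rule (b)
            }
            mode = Mode.WRITE;
            return Result.PROMOTED;
        }
        if (mode == Mode.READ && wanted == Mode.READ) {   // rule (c): read locks are shared
            holders.add(tid);
            return Result.SHARED;
        }
        return Result.MUST_WAIT;                          // rule (b): conflicting lock
    }

    // Rule 2: called for every lock of a transaction when it commits or aborts.
    void releaseAll(String tid) {
        holders.remove(tid);
        if (holders.isEmpty()) {
            mode = null;
        }
    }
}
```

The outcomes follow the lock compatibility table: only read after read is shared, every other combination with another transaction's lock waits.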
43Lock Class
import java.util.Vector;

public class Lock {
    private Object object;      // the object being protected by the lock
    private Vector holders;     // the TIDs of current holders
    private LockType lockType;  // the current type

    public synchronized void acquire(TransID trans, LockType aLockType) {
        while (/* another transaction holds the lock in conflicting mode */) {
            try {
                wait();
            } catch (InterruptedException e) { /* ... */ }
        }
        if (holders.isEmpty()) {  // no TIDs hold lock
            holders.addElement(trans);
            lockType = aLockType;
        } else if (/* another transaction holds the lock, share it */) {
            if (/* this transaction not a holder */) holders.addElement(trans);
        } else if (/* this transaction is a holder but needs a more exclusive lock */) {
            lockType.promote();
        }
    }
44continued
    public synchronized void release(TransID trans) {
        holders.removeElement(trans);  // remove this holder
        // set locktype to none
        notifyAll();                   // wake up the transactions waiting in acquire()
    }
}
45LockManager Class
import java.util.Enumeration;
import java.util.Hashtable;

public class LockManager {
    private Hashtable theLocks;

    public void setLock(Object object, TransID trans, LockType lockType) {
        Lock foundLock;
        synchronized (this) {
            // find the lock associated with object
            // if there isn't one, create it and add to the hashtable
        }
        foundLock.acquire(trans, lockType);
    }

    // synchronize this one because we want to remove all entries
    public synchronized void unLock(TransID trans) {
        Enumeration e = theLocks.elements();
        while (e.hasMoreElements()) {
            Lock aLock = (Lock) (e.nextElement());
            if (/* trans is a holder of this lock */) aLock.release(trans);
        }
    }
}
46Locking Rules for Nested Transactions (Skip)
- Two rules
- Each set of nested transactions is a single entity that must be prevented from observing the partial effects of any other set of nested transactions
- Each transaction within a set of nested transactions must be prevented from observing the partial effects of the other transactions in the set
- Every lock that is acquired by a successful sub-transaction is inherited by its parent when it completes
- Parent transactions are not allowed to run concurrently with their child transactions
- Sub-transactions at the same level are allowed to run concurrently. When they access the same objects, locks serialize their accesses
- For a sub-transaction to acquire a read/write lock on an object, no other transaction can hold a write lock on it except the parent transaction
- When a sub-transaction commits, its locks are inherited by its parent
- When a sub-transaction aborts, its locks are discarded
47Nested Transactions Example (Skip)
Suppose T1, T2 and T11 all access a common object. Suppose that T1 accesses the object first and successfully acquires a lock, which it passes on to T11 for the duration of its execution, getting it back when T11 completes. When T1 completes, T inherits the lock and passes it to T2
48Concurrency Control for Distributed Transactions
- Distributed locking vs. centralized locking
- Centralized locking
- A central server maintains a lock table
- The central server is responsible for the locking of all the data objects in the system
- All data access requests are forwarded to the central server for locking first
- Distributed locking
- Each server maintains a lock table for the locking of the data objects managed by it
- A data access request is forwarded to the server responsible for locking that data object
49Distributed Deadlocks
- A deadlock involving processes located at more than one server is called a distributed deadlock
- Using time-outs to resolve deadlock is clumsy and may result in unpredictable performance (repeated restarts of a large number of transactions). Why?
- Deadlock avoidance vs. deadlock detection
- Deadlock avoidance
- To prevent the formation of a deadlock cycle
- i.e., add rules for serving lock requests such that a deadlock cycle is impossible to form (a change in the locking procedure)
- Deadlock detection
- Follow the original rules for granting a lock
- Periodically (or conditionally) search the wait-for graph (WFG) for a deadlock cycle
- In a distributed deadlock, the (global) WFG is partitioned across multiple servers
- How to search the global WFG without incurring heavy workload (on the network and servers)?
50Deadlock Avoidance using TS
- Deadlock avoidance: prevent a potential deadlock from becoming a deadlock
- Each transaction is assigned a unique time-stamp, e.g., its creation time (in a distributed database: creation time + site ID)
- Wait-die rule (non-preemptive)
- If Ti requests a lock that is already held by Tj, Ti is permitted to wait if and only if Ti is older than Tj (Ti's time-stamp is smaller than that of Tj)
- If Ti is younger than Tj, Ti is restarted with the same time-stamp
- When Ti requests the same lock for the second time, Tj may already have finished its execution
- Wound-wait rule (preemptive)
- If Ti requests a lock that is already held by Tj, Ti is permitted to wait if and only if Ti is younger than Tj
- Otherwise, Tj is restarted (with the same time-stamp) and the lock is granted to Ti
51Deadlock Avoidance using TS
- If TS(Ti) < TS(Tj), Ti waits; else Ti dies (wait-die)
- If TS(Ti) < TS(Tj), Ti wounds Tj; else Ti waits (wound-wait)
- Note: a smaller TS means the transaction is older
- Note: both methods restart the younger transaction
- Both methods prevent cyclic wait
- Consider this deadlock cycle: T1 -> T2 -> T3 -> ... -> Tn -> T1
- It is impossible: along every wait edge the time-stamps change in one direction only (older waits for younger under wait-die, younger waits for older under wound-wait), so if T1 -> ... -> Tn, then Tn is not allowed to wait for T1
- Wait-die: the older transaction is allowed to wait
- Wound-wait: the older transaction is allowed to get the lock
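Both rules reduce to a single time-stamp comparison made when a lock request finds the lock held. A minimal sketch follows; the class TimestampDeadlockAvoidance and the Action names are hypothetical, time-stamps are assumed to be unique long values, and the requester plays the role of Ti and the holder the role of Tj.

```java
// Hypothetical decision functions for the two time-stamp rules (smaller TS = older).
final class TimestampDeadlockAvoidance {
    enum Action { REQUESTER_WAITS, REQUESTER_RESTARTS, HOLDER_RESTARTS }

    // Wait-die (non-preemptive): an older requester waits, a younger requester dies.
    static Action waitDie(long tsRequester, long tsHolder) {
        return tsRequester < tsHolder ? Action.REQUESTER_WAITS : Action.REQUESTER_RESTARTS;
    }

    // Wound-wait (preemptive): an older requester wounds the holder, a younger requester waits.
    static Action woundWait(long tsRequester, long tsHolder) {
        return tsRequester < tsHolder ? Action.HOLDER_RESTARTS : Action.REQUESTER_WAITS;
    }
}
```

In both functions the transaction that gets restarted is always the younger of the two, which is why a restarted transaction keeps its time-stamp: it eventually becomes the oldest and can no longer be restarted.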
52Deadlock Example
Transaction U (TS of U < TS of T)  | Transaction T
Read (A)                           |
Write (B)                          |
                                   | Read (C)
                                   | Write (A) (blocked)
Write (C) (blocked)                |
Deadlock formed
53Deadlock Example (wait-die)
Transaction U (TS of U < TS of T)  | Transaction T
Read (A)                           |
Write (B)                          |
                                   | Read (C)
                                   | Write (A) (restarts): T is restarted since it is younger than U; T releases its read lock on C before restart
Write (C)                          |
54Deadlock Example (wound-wait)
Transaction U (TS of U < TS of T)  | Transaction T
Read (A)                           |
Write (B)                          |
                                   | Read (C)
                                   | Write (A) (blocked, since T is younger than U)
Write (C): T is restarted by U since T is younger than U; the write lock on C is granted to U after T has released its read lock on C |
55Deadlock Resolution by time-out
- A simple method to break a deadlock cycle is the time-out method
- Once a deadlock is formed, it will exist forever until it is resolved
- In the time-out method, two parameters are defined: a time-out period (TP) and a time-out checking period (TCP). Normally, TP >> TCP
- The time-out checking period defines the interval at which the blocked transactions (in the lock table) are checked for deadlock
- If a transaction has been blocked for longer than the time-out period, it will be restarted as it is assumed to be involved in a deadlock
- So, no deadlock cycle exists in the system for longer than TP + TCP
56Deadlock Resolution by time-out
- The problems in using the time-out method
- How to define the time-out period (and TCP)
- If it is large, a deadlock cycle will exist in the system for a long period of time
- If it is small, many transactions will be restarted even though they are not involved in any deadlock (false deadlock)
- The advantages
- Simple to implement; the overhead of using the time-out method is low and depends on the values of TCP and TP
- No undetected deadlock (it can resolve all deadlocks)
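The check that runs once every time-out checking period can be sketched as a scan over the blocked transactions. The names below are hypothetical, and the lock table is reduced to a map from transaction ID to the time at which the transaction became blocked.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical periodic check of the time-out method.
final class TimeoutDeadlockResolver {
    // Returns the transactions blocked for longer than timeoutPeriod (TP) at time `now`.
    // The caller is expected to invoke this once every checking period (TCP).
    static List<String> toRestart(Map<String, Long> blockedSince, long now, long timeoutPeriod) {
        List<String> victims = new ArrayList<>();
        for (Map.Entry<String, Long> e : blockedSince.entrySet()) {
            if (now - e.getValue() > timeoutPeriod) {
                victims.add(e.getKey()); // assumed to be deadlocked, which may be a false deadlock
            }
        }
        Collections.sort(victims); // deterministic order for the caller
        return victims;
    }
}
```

Because a transaction is only examined at the next check after its blocking time exceeds TP, a deadlock can survive for up to TP + TCP, which matches the bound stated on the previous slide.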
57Interleaving of Transactions U, V and W
U                            | V                            | W
d.deposit(10)   lock D       |                              |
                             | b.deposit(10)   lock B at Y  |
a.deposit(20)   lock A at X  |                              |
                             |                              | c.deposit(30)   lock C at Z
b.withdraw(30)  wait at Y    |                              |
                             | c.withdraw(20)  wait at Z    |
                             |                              | a.withdraw(20)  wait at X
58Distributed Deadlock
59Local and Global wait-for Graphs
60Distributed Deadlock Detection
- Merge the local WFGs at different servers to build a global wait-for graph
- How and when to submit the local WFG at a server to other servers?
- Too frequent: heavy overhead
- Too infrequent:
- phantom deadlock (a deadlock is detected but it is not a real one)
- blocking time is long
- Edge chasing
- To reduce the detection overhead,
- Send the blocking relationship when a transaction is blocked
- Do not send the local WFG to all servers, only to those where there is a potential deadlock
- Do not send the whole WFG, only the nodes that are sufficient for deadlock detection
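The merge-then-search approach can be sketched with the local graphs represented as adjacency maps. The class and method names are hypothetical, and the cycle search is an ordinary depth-first search rather than anything specific to the slides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical centralized detector: merge local wait-for graphs, then look for a cycle.
final class GlobalWaitForGraph {
    // Each graph maps a waiting transaction to the set of transactions it waits for.
    static Map<String, Set<String>> merge(List<Map<String, Set<String>>> locals) {
        Map<String, Set<String>> global = new HashMap<>();
        for (Map<String, Set<String>> local : locals) {
            for (Map.Entry<String, Set<String>> e : local.entrySet()) {
                global.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
            }
        }
        return global;
    }

    static boolean hasCycle(Map<String, Set<String>> wfg) {
        Set<String> done = new HashSet<>();   // nodes visited at least once
        Set<String> onPath = new HashSet<>(); // nodes on the current DFS path
        for (String t : wfg.keySet()) {
            if (dfs(t, wfg, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean dfs(String t, Map<String, Set<String>> wfg,
                               Set<String> done, Set<String> onPath) {
        if (onPath.contains(t)) {
            return true;                      // reached a node already on the path: cycle
        }
        if (!done.add(t)) {
            return false;                     // fully explored earlier without finding a cycle
        }
        onPath.add(t);
        for (String next : wfg.getOrDefault(t, Set.of())) {
            if (dfs(next, wfg, done, onPath)) {
                return true;
            }
        }
        onPath.remove(t);
        return false;
    }
}
```

With the interleaving of U, V and W shown earlier, server Y sees only U -> V, server Z only V -> W and server X only W -> U, so no single server finds a cycle, but the merged graph contains U -> V -> W -> U.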
61Edge Chasing
- The servers attempt to find deadlock cycles by forwarding probes, which follow the edges of the graph throughout the distributed system
- A probe consists of transaction wait-for relationships representing a path in the global wait-for graph (<T -> U> indicates that T is blocked by U)
- When a probe returns to the server that generated it, a distributed deadlock is detected
- Initiation step
- When a server notes that a transaction T starts waiting for another, blocked transaction U (lock request rejected), it initiates a detection by sending a probe containing the edge <T -> U> to the server of the object at which U is blocked. If U is sharing a lock, probes are sent to all the holders of the lock
62Edge Chasing
- Detection step
- When a server receives a probe <T -> U>, it checks whether U is still waiting. If yes (e.g., waiting for V), it adds the edge to the probe: <T -> U -> V>. If V is blocked, it forwards the updated probe to the server of the object for which V is waiting
- Resolution: when a deadlock cycle is detected, a transaction in the cycle is selected to roll back (releasing all its locks)
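The initiation and detection steps can be sketched as a probe that grows by one edge per hop. This is a single-process simplification under stated assumptions: the names are hypothetical, the wait-for edges held at the different servers are collapsed into one map, each transaction waits for at most one other (no shared locks), and forwarding a probe between servers is modelled as one loop iteration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical single-process simulation of edge chasing.
final class EdgeChasing {
    // Starts a probe <waiter -> holder> and extends it along waitsFor.
    // Returns the cycle as a path if the probe returns to the initiator, else an empty list.
    static List<String> chase(String waiter, String holder, Map<String, String> waitsFor) {
        List<String> probe = new ArrayList<>(List.of(waiter, holder)); // initiation step
        Set<String> seen = new HashSet<>(probe);
        String current = holder;
        while (waitsFor.containsKey(current)) {        // detection step: is `current` still waiting?
            String next = waitsFor.get(current);
            if (next.equals(waiter)) {
                probe.add(next);                       // probe came back to its initiator
                return probe;
            }
            if (!seen.add(next)) {
                return List.of();                      // a cycle that does not include the initiator
            }
            probe.add(next);                           // extend the probe and forward it
            current = next;
        }
        return List.of();                              // reached a transaction that is not blocked
    }
}
```

Once a non-empty path is returned, the resolution step picks one transaction on it to roll back.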
63Probes transmitted to detect deadlock
64Edge Chasing with Priorities
- Initially U -> W and V -> T. At about the same time, T requests an object locked by U and W is blocked by V
- Two probes are triggered and the deadlock cycle is detected twice
- Transactions are prioritized to reduce the number of probes (probes travel from higher priority to lower priority, with T > U > V > W)
- Abort the lowest-priority transaction in the deadlock cycle
- Reducing the number of probes
- T -> U initiates a probe
- W -> V: W has lower priority than V, so the probe will not be sent
- What if "W is blocked by V" is the last edge formed in the cycle?
65Two probes Initiated
(a) initial situation
(b) detection initiated at object requested by T
(c) detection initiated at object requested by W
66Probe Travel Downhill
- When a transaction starts waiting for an object, it forwards the probes in its queue to the server of the object, which propagates the probes on downhill routes
- When U starts waiting for V, the coordinator of V saves the probe <U -> V>
- When V starts waiting for W, the coordinator of W stores <V -> W> and V forwards its probe queue <U -> V> to W
- When W starts waiting for A, it forwards its probe queue <U -> V -> W> to the server of A, which also notes the dependency W -> U and combines it with the information in the received probe to obtain U -> V -> W -> U
67Probes Travel Downhill
(a) V stores probe when U starts waiting
(b) Probe is forwarded when V starts waiting
68References
- Dollimore 13.1 to 13.41, 14.41, 14.5
- Tanenbaum 5.6 and 7.5.1