Distributed Systems Course Distributed transactions


Title: Figure 15.1 A distributed multimedia system Author: George Coulouris Last modified by: czhang Created Date: 6/18/2000 9:59:47 PM Document presentation format – PowerPoint PPT presentation

Slides: 52

Provided by: George508

Learn more at: http://users.cis.fiu.edu


1
Distributed Systems Course Distributed
transactions

13.1 Introduction
13.2 Flat and nested distributed transactions
13.3 Atomic commit protocols
13.4 Concurrency control in distributed
transactions
13.5 Distributed deadlocks
13.6 Transaction recovery

2
Commitment of distributed transactions -
introduction

A distributed transaction is a flat or
nested transaction that accesses objects managed
by multiple servers.
When a distributed transaction comes to an end,
either all of the servers commit the
transaction
or all of them abort the transaction.
One of the servers is the coordinator; it must ensure
the same outcome at all of the servers.
The two-phase commit protocol is the most
commonly used protocol for achieving this.

3
Distributed transactions
In a nested transaction, the top-level
transaction can open subtransactions, and each
subtransaction can open further subtransactions
down to any depth of nesting
In the nested case, subtransactions at the same
level can run concurrently, so T1 and T2 are
concurrent, and as they invoke objects in
different servers, they can run in parallel.
A flat client transaction completes each of its
requests before going on to the next one.
Therefore, each transaction accesses servers'
objects sequentially.

4
Nested banking transaction
requests can be run in parallel - with several
servers, the nested transaction is more efficient

client transfers 10 from A to C and then
transfers 20 from B to

5
The coordinator of a flat distributed transaction
Why might a participant abort a transaction?

Servers execute requests in a distributed
transaction
when the transaction commits, the servers must communicate with one
another to coordinate their actions
a client starts a transaction by sending an
openTransaction request to a coordinator in any
server (next slide)
the coordinator returns a TID unique in the distributed
system (e.g. server ID plus local transaction number)
at the end, the coordinator will be responsible for committing
or aborting the transaction
each server managing an object accessed by the
transaction is a participant - it joins the
transaction (next slide)
a participant keeps track of objects involved in
the transaction
at the end it cooperates with the coordinator in
carrying out the commit protocol
note that a participant can call abortTransaction
in the coordinator

6
A flat distributed banking transaction
A client's (flat) banking transaction involves
accounts A, B, C and D at servers BranchX,
BranchY and BranchZ
Each server is shown with a participant, which
joins the transaction by invoking the join method
in the coordinator

Note that the TID (T) is passed with each request
e.g. withdraw(T,3)

7
The join operation

The interface for Coordinator is shown in Figure
12.3
it has openTransaction, closeTransaction and
abortTransaction
openTransaction returns a TID, which is passed
with each operation so that each server knows which
transaction is accessing its objects
The Coordinator interface provides an additional
method, join, which is used whenever a new
participant joins the transaction
join(Trans, reference to participant)
informs a coordinator that a new participant has
joined the transaction Trans.
the coordinator records the new participant in
its participant list.
the fact that the coordinator knows all the
participants and each participant knows the
coordinator will enable them to collect the
information that will be needed at commit time.
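
The join bookkeeping described above can be sketched in a few lines. This is only an illustration, not the course's interface definition: the class name, the snake_case method names and the "server:number" TID format are assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Coordinator:
    """Toy coordinator: issues TIDs and records the participants of each one."""
    server_id: str
    _local_numbers: count = field(default_factory=lambda: count(1))
    participants: dict = field(default_factory=dict)  # TID -> list of participants

    def open_transaction(self) -> str:
        # TID built from the server ID and a local transaction number
        # (the "server:number" string format is an assumption).
        tid = f"{self.server_id}:{next(self._local_numbers)}"
        self.participants[tid] = []
        return tid

    def join(self, tid: str, participant: str) -> None:
        # A participant is recorded once, however many requests it serves.
        if participant not in self.participants[tid]:
            self.participants[tid].append(participant)


coord = Coordinator("BranchX")
tid = coord.open_transaction()
for server in ("BranchX", "BranchY", "BranchZ", "BranchY"):
    coord.join(tid, server)
print(tid, coord.participants[tid])
```

At commit time the coordinator can iterate over `participants[tid]` to reach every server that took part.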

8
Atomic commit protocols

transaction atomicity requires that at the end,
either all of its operations are carried out or
none of them.
in a distributed transaction, the client has
requested the operations at more than one server
one-phase atomic commit protocol
the coordinator tells the participants whether to
commit or abort
what is the problem with that?
this does not allow one of the servers to decide
to abort; it may have discovered a deadlock or
it may have crashed and been restarted
two-phase atomic commit protocol
is designed to allow any participant to choose to
abort a transaction
phase 1 - each participant votes. If it votes to
commit, it is prepared and cannot change its
mind. In case it crashes, it must save updates in
permanent storage
phase 2 - the participants carry out the joint
decision

The decision could be commit or abort -
participants record it in permanent store

9
Failure model for the commit protocols

Recall the failure model for transactions in
Chapter 12
this applies to the two-phase commit protocol
Commit protocols are designed to work in
an asynchronous system (e.g. messages may take a
very long time)
servers may crash
messages may be lost.
assume corrupt and duplicated messages are
removed.
no Byzantine faults: servers either crash or
they obey their requests
2PC is an example of a protocol for reaching a
consensus.
Chapter 11 says consensus cannot be reached in an
asynchronous system if processes sometimes fail.
However, 2PC does reach consensus under those
conditions,
because crash failures of processes are masked by
replacing a crashed process with a new process
whose state is set from information saved in
permanent storage and information held by other
processes.

10
The two-phase commit protocol
Why does the participant record updates in permanent
storage at this stage?
How many messages are sent between the
coordinator and each participant?

During the progress of a transaction, the only
communication between coordinator and participant
is the join request
The client request to commit or abort goes to the
coordinator
if the client or a participant requests abort, the
coordinator informs the participants immediately
if the client asks to commit, the 2PC comes into
use
2PC
voting phase: the coordinator asks all participants
if they can commit
if yes, the participant records updates in permanent
storage and then votes
completion phase: the coordinator tells all
participants to commit or abort
the next slide shows the operations used in
carrying out the protocol

11
Operations for two-phase commit protocol
This is a request with a reply
Asynchronous request

participant interface - canCommit?, doCommit, doAbort
coordinator interface - haveCommitted, getDecision

12
The two-phase commit protocol

Phase 1 (voting phase)
1. The coordinator sends a canCommit? request to
each of the participants in the transaction.
2. When a participant receives a canCommit?
request it replies with its vote (Yes or No) to
the coordinator. Before voting Yes, it prepares
to commit by saving objects in permanent storage.
If the vote is No the participant aborts
immediately.
Phase 2 (completion according to outcome of
vote)
3. The coordinator collects the votes (including
its own).
(a) If there are no failures and all the votes are
Yes, the coordinator decides to commit the
transaction and sends a doCommit request to each
of the participants.
(b) Otherwise the coordinator decides to abort the
transaction and sends doAbort requests to all
participants that voted Yes.
4. Participants that voted Yes are waiting for a
doCommit or doAbort request from the coordinator.
When a participant receives one of these messages
it acts accordingly and in the case of commit,
makes a haveCommitted call as confirmation to the
coordinator.
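
The four steps above can be traced with a minimal in-memory sketch. It assumes no crashes or lost messages, and the class and function names are hypothetical; a real participant would write to permanent storage where the comment indicates.

```python
class Participant:
    """Toy participant; `willing` decides how it votes (an assumption)."""

    def __init__(self, name, willing=True):
        self.name = name
        self.willing = willing
        self.state = "active"

    def can_commit(self, tid):
        if self.willing:
            # A real server would save its updates to permanent storage here.
            self.state = "prepared"
            return True
        self.state = "aborted"  # a No voter aborts immediately
        return False

    def do_commit(self, tid):
        self.state = "committed"

    def do_abort(self, tid):
        self.state = "aborted"


def two_phase_commit(tid, participants):
    # Phase 1 (voting): ask every participant for its vote.
    votes = {p: p.can_commit(tid) for p in participants}
    # Phase 2 (completion): commit only if every vote is Yes.
    if all(votes.values()):
        for p in participants:
            p.do_commit(tid)
        return "committed"
    # Otherwise abort; only the Yes voters still need to be told.
    for p, voted_yes in votes.items():
        if voted_yes:
            p.do_abort(tid)
    return "aborted"


servers = [Participant("BranchX"), Participant("BranchY", willing=False)]
print(two_phase_commit("T", servers), [p.state for p in servers])
```

A single No vote is enough to abort the transaction at every server, which is the property the one-phase protocol lacks.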

13
Communication in two-phase commit protocol
Think about the coordinator in step 1 - what is
the problem?
Think about step 2 - what is the problem for the
participant?
Think about participant before step 2 - what is
the problem?

Time-out actions in the 2PC
to avoid blocking forever when a process crashes
or a message is lost
uncertain participant (step 2) has voted Yes. It
cannot decide on its own
it uses the getDecision method to ask the coordinator
about the outcome
participant has carried out client requests, but
has not had a canCommit? from the coordinator. It can
abort unilaterally
coordinator delayed in waiting for votes (step
1). It can abort and send doAbort to
participants.
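
The three time-out cases above amount to a small lookup from (role, state) to action. The sketch below encodes them; the state labels and the "keep waiting" default are assumptions made for the example.

```python
# (role, state at time-out) -> action, following the three cases listed above.
TIMEOUT_ACTIONS = {
    ("participant", "uncertain"): "call getDecision on the coordinator",
    ("participant", "awaiting canCommit?"): "abort unilaterally",
    ("coordinator", "awaiting votes"): "abort and send doAbort to participants",
}


def on_timeout(role, state):
    # Any other combination simply keeps waiting (an assumption of this sketch).
    return TIMEOUT_ACTIONS.get((role, state), "keep waiting")


print(on_timeout("participant", "uncertain"))
```

Only the uncertain participant is stuck: it has voted Yes, so it may neither commit nor abort until it learns the decision.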

14
Performance of the two-phase commit protocol

if there are no failures, the 2PC involving N
participants requires
N canCommit? messages and replies, followed by
N doCommit messages.
the cost in messages is proportional to 3N, and
the cost in time is three rounds of messages.
The haveCommitted messages are not counted
there may be arbitrarily many server and
communication failures
2PC is guaranteed to complete eventually, but
it is not possible to specify a time limit within
which it will be completed
this causes delays to participants in the uncertain state
some three-phase commit protocols (3PC) are designed to alleviate such delays
they require more messages and more rounds for
the normal case
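
The failure-free message count can be checked with simple arithmetic. The function name and the optional flag for haveCommitted messages are assumptions made for this sketch.

```python
def two_pc_messages(n, count_have_committed=False):
    """Messages exchanged by a failure-free 2PC with n participants."""
    messages = {"canCommit?": n, "vote reply": n, "doCommit": n}
    if count_have_committed:
        # Not counted on the slide; shown only for comparison.
        messages["haveCommitted"] = n
    return sum(messages.values())


for n in (1, 3, 10):
    print(n, two_pc_messages(n))
```

With three participants this gives 9 messages, in three rounds.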

15
13.3.2 Two-phase commit protocol for nested
transactions

Recall Fig 13.1b, top-level transaction T and
subtransactions T1, T2, T11, T12, T21, T22
A subtransaction starts after its parent and
finishes before it
When a subtransaction completes, it makes an
independent decision either to commit
provisionally or to abort.
A provisional commit is not the same as being
prepared: it is a local decision and is not
backed up on permanent storage.
If the server crashes subsequently, its
replacement will not be able to carry out a
provisional commit.
A two-phase commit protocol is needed for nested
transactions
it allows servers of provisionally committed
transactions that have crashed to abort them when
they recover.

16
Figure 13.7 Operations in coordinator for nested
transactions
The TID of a subtransaction is an extension of
its parent's TID, so that a subtransaction can
work out the TID of the top-level transaction.
The client finishes a set of nested transactions
by calling closeTransaction or abortTransaction
in the top-level transaction.
openSubTransaction(trans) -> subTrans
Opens a new subtransaction whose parent is trans and returns
a unique subtransaction identifier.
getStatus(trans) -> committed, aborted, provisional
Asks the coordinator to report on the status of the
transaction trans. Returns values representing
one of the following: committed, aborted,
provisional.

This is the interface of the coordinator of a
subtransaction.
It allows it to open further subtransactions
It allows its subtransactions to enquire about
its status
Client starts by using openTransaction to open a
top-level transaction.
This returns a TID for the top-level transaction
The TID can be used to open a subtransaction
The subtransaction automatically joins the parent
and a TID is returned.

17
Transaction T decides whether to commit
T12 has provisionally committed and T11 has
aborted, but the fate of T12 depends on its
parent T1 and eventually on the top-level
transaction, T.
Although T21 and T22 have both provisionally
committed, T2 has aborted and this means that T21
and T22 must also abort.
Suppose that T decides to commit although T2 has
aborted, also that T1 decides to commit although
T11 has aborted
Figure 13.8

Recall that
A parent can commit even if a subtransaction
aborts
If a parent aborts, then its subtransactions must
abort
In the figure, each subtransaction has either
provisionally committed or aborted
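
The two rules recalled above determine which subtransactions can still take part in the final commit. The sketch below applies them to the statuses given on this slide; the dictionaries and function names are assumptions made for the example.

```python
# Parent of each subtransaction and its status, as described on this slide.
parent = {"T1": "T", "T2": "T", "T11": "T1", "T12": "T1", "T21": "T2", "T22": "T2"}
status = {
    "T1": "provisional", "T2": "aborted",
    "T11": "aborted", "T12": "provisional",
    "T21": "provisional", "T22": "provisional",
}


def has_aborted_ancestor(t):
    # Walk up the tree; the top-level transaction T has no status entry here.
    p = parent.get(t)
    while p is not None:
        if status.get(p) == "aborted":
            return True
        p = parent.get(p)
    return False


def can_still_commit(t):
    return status[t] == "provisional" and not has_aborted_ancestor(t)


eligible = sorted(t for t in status if can_still_commit(t))
orphans = sorted(t for t in status
                 if status[t] == "provisional" and has_aborted_ancestor(t))
print(eligible, orphans)
```

T21 and T22 have provisionally committed but must abort because their parent T2 aborted, while T12 survives even though its sibling T11 aborted.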

18
Information held by coordinators of nested
transactions

When a top-level transaction commits it carries
out a 2PC
Each coordinator has a list of its
subtransactions
At provisional commit, a subtransaction reports
its status and the status of its descendents to
its parent
If a subtransaction aborts, it tells its parent

Figure 13.9

T12 and T21 share a coordinator as they both run
at server N
When T2 is aborted it tells T (no information
about descendents)
A subtransaction (e.g. T21 and T22) is called an
orphan if one of its ancestors aborts
an orphan uses getStatus to ask its parent about
the outcome. It should abort if its parent has

19
canCommit? for hierarchic two-phase commit
protocol
canCommit?(trans, subTrans) -> Yes / No
Call from a coordinator to the coordinator of a child
subtransaction to ask whether it can commit the
subtransaction subTrans. The first argument trans
is the transaction identifier of the top-level
transaction. The participant replies with its vote
Yes / No.
Figure 13.10

Top-level transaction is coordinator of 2PC.
participant list
the coordinators of all the subtransactions that
have provisionally committed
but do not have an aborted ancestor
E.g. T, T1 and T12 in Figure 13.8
if they vote yes, they prepare to commit by
saving state in permanent store
The state is marked as belonging to the top-level
transaction
The 2PC may be performed in a hierarchic or a
flat manner

Hierarchic 2PC - T asks canCommit? to T1 and T1
asks canCommit? to T12
The trans argument is used when saving the
objects in permanent storage
The subTrans argument is used to find the
subtransaction to vote on. If absent, vote no.

20
canCommit? for flat two-phase commit protocol
Compare the advantages and disadvantages of the
flat and nested approaches
canCommit?(trans, abortList) -> Yes / No
Call from coordinator to participant to ask whether it
can commit a transaction. Participant replies
with its vote Yes / No.
Figure 13.11

Flat 2PC
the coordinator of the top-level transaction
sends canCommit? messages to the coordinators of
all of the subtransactions in the provisional
commit list.
in our example, T sends to the coordinators of T1
and T12.
the trans argument is the TID of the top-level
transaction
the abortList argument gives all aborted
subtransactions
e.g. server N has T12 provisionally committed and T21
aborted
On receiving canCommit?, the participant
looks in list of transactions for any that match
trans (e.g. T12 and T21 at N)
it prepares any that have provisionally committed
and are not in abortList and votes yes
if it can't find any it votes no
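
The participant's side of the flat protocol can be sketched as a single function. The data layout (a dictionary from subtransaction to its top-level TID and local status) and the contents of the abort list are assumptions made for illustration.

```python
def can_commit(trans, abort_list, local):
    """Vote on top-level transaction `trans` at one server.

    local maps subtransaction -> (top-level TID, local status).
    Returns the vote and the subtransactions that were prepared.
    """
    matching = [sub for sub, (top, _) in local.items() if top == trans]
    to_prepare = [
        sub for sub in matching
        if local[sub][1] == "provisional" and sub not in abort_list
    ]
    if not to_prepare:
        return "No", []
    # A real server would save the prepared state to permanent storage here.
    return "Yes", to_prepare


# Server N from the example: T12 provisionally committed, T21 aborted.
server_n = {"T12": ("T", "provisional"), "T21": ("T", "aborted")}
print(can_commit("T", ["T11", "T2"], server_n))
```

Passing the abort list lets a server vote without contacting the coordinators of intermediate subtransactions, which is the main saving over the hierarchic form.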

21
Time-out actions in nested 2PC

With nested transactions delays can occur in the
same three places as before
when a participant is prepared to commit
when a participant has finished but has not yet
received canCommit?
when a coordinator is waiting for votes
Fourth place
provisionally committed subtransactions of
aborted subtransactions e.g. T22 whose parent T2
has aborted
use getStatus on parent, whose coordinator should
remain active for a while
If parent does not reply, then abort

22
Summary of 2PC

a distributed transaction involves several
different servers.
A nested transaction structure allows
additional concurrency and
independent committing by the servers in a
distributed transaction.
atomicity requires that the servers participating
in a distributed transaction either all commit it
or all abort it.
atomic commit protocols are designed to achieve
this effect, even if servers crash during their
execution.
the 2PC protocol allows a server to abort
unilaterally.
it includes timeout actions to deal with delays
due to servers crashing.
2PC protocol can take an unbounded amount of time
to complete but is guaranteed to complete
eventually.

23
13.4 Concurrency control in distributed
transactions

Each server manages a set of objects and is
responsible for ensuring that they remain
consistent when accessed by concurrent
transactions
therefore, each server is responsible for
applying concurrency control to its own objects.
the members of a collection of servers of
distributed transactions are jointly responsible
for ensuring that they are performed in a
serially equivalent manner
therefore if transaction T is before transaction
U in their conflicting access to objects at one
of the servers then they must be in that order at
all of the servers whose objects are accessed in
a conflicting manner by both T and U

24
13.4.1 Locking

In a distributed transaction, the locks on an
object are held by the server that manages it.
The local lock manager decides whether to grant a
lock or make the requesting transaction wait.
it cannot release any locks until it knows that
the transaction has been committed or aborted at
all the servers involved in the transaction.
the objects remain locked and are unavailable for
other transactions during the atomic commit
protocol
an aborted transaction releases its locks after
phase 1 of the protocol.

25
Interleaving of transactions T and U at servers X
and Y

in the example on page 529, we have
T before U at server X and U before T at server Y
different orderings lead to cyclic dependencies
and distributed deadlock
detection and resolution of distributed deadlock
in next section

T: Write(A) at X, locks A
U: Write(B) at Y, locks B
T: Read(B) at Y, waits for U
U: Read(A) at X, waits for T

26
13.4.2 Timestamp ordering concurrency control

Single server transactions
coordinator issues a unique timestamp to each
transaction before it starts
serial equivalence ensured by committing objects
in order of timestamps
Distributed transactions
the first coordinator accessed by a transaction
issues a globally unique timestamp
as before the timestamp is passed with each
object access
the servers are jointly responsible for ensuring
serial equivalence
that is, if T accesses an object before U, then T is
before U at all objects
coordinators agree on timestamp ordering
a timestamp consists of a pair <local timestamp,
server-id>.
the agreed ordering of pairs of timestamps is
based on a comparison in which the server-id part
is less significant; timestamps should relate to time
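
Comparing pairs with the server-id as the less significant part is ordinary lexicographic ordering, which Python tuples provide directly. The sketch below uses integer local timestamps and single-letter server ids, both of which are assumptions.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    local_time: int   # more significant part
    server_id: str    # less significant part, only breaks ties


t = Timestamp(5, "X")
u = Timestamp(5, "Y")
v = Timestamp(4, "Z")

# Tuples compare field by field, so local_time decides first.
ordered = sorted([t, u, v])
print(ordered)
```

Every server that compares timestamps this way reaches the same total order, whether or not the local clocks are synchronized.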

27
Timestamp ordering concurrency control (continued)
Can the same ordering be achieved at all servers
without clock synchronization?
Why is it better to have roughly synchronized
clocks?

The same ordering can be achieved at all servers
even if their clocks are not synchronized
for efficiency it is better if local clocks are
roughly synchronized
then the ordering of transactions corresponds
roughly to the real time order in which they were
started
Timestamp ordering
conflicts are resolved as each operation is
performed
if this leads to an abort, the coordinator will
be informed
it will abort the transaction at the participants
any transaction that reaches the client request
to commit should always be able to do so
participant will normally vote yes
unless it has crashed and recovered during the
transaction

28
Optimistic concurrency control
Use backward validation
Validation rules: 1. write/read (satisfied), 2. read/write (checked), 3. write/write (parallel)

each transaction is validated before it is
allowed to commit
transaction numbers assigned at start of
validation
transactions serialized according to transaction
numbers
validation takes place in phase 1 of 2PC protocol
consider the following interleavings of T and U
T before U at X and U before T at Y

Suppose T and U start validation at about the same
time
T: Read(A) at X, Write(A), Read(B) at Y, Write(B)
U: Read(B) at Y, Write(B), Read(A) at X, Write(A)
X does T first; Y does U first
Without parallel validation: commitment deadlock

29
Commitment deadlock in optimistic concurrency
control

servers of distributed transactions do parallel
validation
therefore rule 3 must be validated as well as
rule 2
the write set of Tv is checked for overlaps with
write sets of earlier transactions
this prevents commitment deadlock
it also avoids delaying the 2PC protocol
another problem - independent servers may
schedule transactions in different orders
e.g. T before U at X and U before T at Y
this must be prevented - some hints as to how on
page 531
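
The check performed for rules 2 and 3 can be written with set intersections. This is a sketch of one server's local validation only; the function signature is an assumption and it does not address the global ordering problem mentioned above.

```python
def backward_validate(read_set, write_set, earlier_write_sets, parallel=True):
    """Validate transaction Tv against overlapping earlier transactions."""
    for earlier_writes in earlier_write_sets:
        # Rule 2: Tv must not have read anything an earlier transaction wrote.
        if read_set & earlier_writes:
            return False
        # Rule 3: with parallel validation the write sets must not overlap either.
        if parallel and write_set & earlier_writes:
            return False
    return True


print(backward_validate({"A"}, {"A"}, [{"B"}]))
print(backward_validate({"A"}, {"B"}, [{"B"}]))
print(backward_validate({"A"}, {"B"}, [{"B"}], parallel=False))
```

The second and third calls differ only in the `parallel` flag: the write/write overlap on B is rejected only when rule 3 is checked.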

30
13.5 Distributed deadlocks

Single server transactions can experience
deadlocks
prevent or detect and resolve
use of timeouts is clumsy, detection is
preferable.
it uses wait-for graphs.
Distributed transactions lead to distributed
deadlocks
in theory can construct global wait-for graph
from local ones
a cycle in a global wait-for graph that is not in
local ones is a distributed deadlock

31
Figure 13.12 Interleavings of transactions U, V
and W

objects A and B are managed by X and Y; C and D by Z
next slide has global wait-for graph

U → V at Y
V → W at Z
W → U at X

32
Figure 13.13 Distributed deadlock

a deadlock cycle has alternate edges showing
wait-for and held-by
wait-for edges added in order U → V at Y, V → W at Z
and W → U at X

33
Deadlock detection - local wait-for graphs

Local wait-for graphs can be built, e.g.
server Y: U → V added when U requests
b.withdraw(30)
server Z: V → W added when V requests
c.withdraw(20)
server X: W → U added when W requests
a.withdraw(20)
to find a global cycle, communication between the
servers is needed
centralized deadlock detection
one server takes on role of global deadlock
detector
the other servers send it their local graphs from
time to time
it detects deadlocks, makes decisions about which
transactions to abort and informs the other
servers
usual problems of a centralized service - poor
availability, lack of fault tolerance and no
ability to scale
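
A centralized detector merges the local graphs and searches the result for a cycle. The sketch below uses the three local edges listed above; the depth-first search and the data layout are assumptions, chosen for brevity rather than efficiency.

```python
def find_cycle(edges):
    """Return one cycle in a wait-for graph as a list of nodes, or None."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, []).append(holder)

    def visit(node, path):
        if node in path:
            # Reached a node already on the current path: a cycle.
            return path[path.index(node):] + [node]
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


local_graphs = {"X": [("W", "U")], "Y": [("U", "V")], "Z": [("V", "W")]}
global_edges = [edge for edges in local_graphs.values() for edge in edges]
print(find_cycle(global_edges))
```

No single local graph contains a cycle, so the deadlock is visible only after the graphs are combined.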

34
Figure 13.14 Local and global wait-for graphs

Phantom deadlocks
a deadlock that is detected, but is not really
one
happens when there appears to be a cycle, but one
of the transactions has released a lock, due to
time lags in distributing graphs
in the figure, suppose U releases the object at X
then waits for V at Y
and the global detector gets Y's graph before X's
(T → U → V → T)

35
Edge chasing - a distributed approach to deadlock
detection

a global graph is not constructed, but each
server knows about some of the edges
servers try to find cycles by sending probes
which follow the edges of the graph through the
distributed system
when should a server send a probe? (go back to Fig
13.13)
edges were added in order U → V at Y, V → W at Z
and W → U at X
when W → U at X was added, U was waiting, but
when V → W at Z was added, W was not waiting
send a probe when an edge T1 → T2 is added and T2 is
waiting
each coordinator records whether its transactions
are active or waiting
the local lock manager tells coordinators if
transactions start/stop waiting
when a transaction is aborted to break a
deadlock, the coordinator tells the participants,
locks are removed and edges taken from wait-for
graphs

36
Edge-chasing algorithms

Three steps
Initiation
When a server notes that T starts waiting for U,
where U is waiting at another server, it
initiates detection by sending a probe containing
the edge <T → U> to the server where U is
blocked.
If U is sharing a lock, probes are sent to all
the holders of the lock.
Detection
Detection consists of receiving probes and
deciding whether deadlock has occurred and
whether to forward the probes.
e.g. when a server receives probe <T → U> it
checks if U is waiting, e.g. U → V; if so it
forwards <T → U → V> to the server where V waits
when a server adds a new edge, it checks whether
a cycle is there
Resolution
When a cycle is detected, a transaction in the
cycle is aborted to break the deadlock.

37
Figure 13.15 Probes transmitted to detect deadlock

example of edge chasing: X sends <W → U>,
then Y sends <W → U → V>,
then Z sends <W → U → V → W>
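
The probe sequence in this example can be reproduced with a small simulation. It models each transaction as waiting for at most one other transaction and ignores shared locks, message loss and aborts; the function name and the data layout are assumptions.

```python
def chase(start, blocked_on):
    """Follow wait-for edges from `start`, extending a probe at each server.

    blocked_on maps transaction -> (transaction it waits for, server holding
    that edge). Returns (cycle or None, log of (server, probe) pairs).
    """
    holder, server = blocked_on[start]
    probe = [start, holder]
    log = [(server, list(probe))]
    while probe[-1] in blocked_on:
        holder, server = blocked_on[probe[-1]]
        probe.append(holder)
        log.append((server, list(probe)))
        if holder in probe[:-1]:
            return probe, log  # the probe has come back on itself: deadlock
    return None, log           # reached a transaction that is not waiting


blocked_on = {"W": ("U", "X"), "U": ("V", "Y"), "V": ("W", "Z")}
cycle, log = chase("W", blocked_on)
for server, probe in log:
    print(server, " -> ".join(probe))
```

The log shows the probe growing by one edge at each of X, Y and Z, and the cycle is recognised when W reappears in the path.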

38
Edge chasing conclusion

probe to detect a cycle with N transactions will
require 2(N-1) messages.
Studies of databases show that the average
deadlock involves 2 transactions.
the above algorithm detects deadlock provided
that
waiting transactions do not abort
no process crashes, no lost messages
to be realistic it would need to allow for the
above failures
refinements of the algorithm (p 536-7)
to avoid more than one transaction causing
detection to start and then more than one being
aborted
not time to study these now

39
Figure 13.16 Two probes initiated

40
Figure 13.17 Probes travel downhill

41
Summary of concurrency control for distributed
transactions

each server is responsible for the
serializability of transactions that access its
own objects.
additional protocols are required to ensure that
transactions are serializable globally.
timestamp ordering requires a globally agreed
timestamp ordering
optimistic concurrency control requires global
validation or a means of forcing a global
ordering on transactions.
two-phase locking can lead to distributed
deadlocks.
distributed deadlock detection looks for cycles
in the global wait-for graph.
edge chasing is a non-centralized approach to the
detection of distributed deadlocks.

42
13.6 Transaction recovery
What is meant by durability?
What is meant by failure atomicity?

Atomicity property of transactions
durability and failure atomicity
durability requires that objects are saved in
permanent storage and will be available
indefinitely
failure atomicity requires that effects of
transactions are atomic even when the server
crashes
Recovery is concerned with
ensuring that a server's objects are durable and
that the service provides failure atomicity.
for simplicity we assume that when a server is
running, all of its objects are in volatile
memory
and all of its committed objects are in a
recovery file in permanent storage
recovery consists of restoring the server with
the latest committed versions of all of its
objects from its recovery file

43
Recovery manager

The task of the Recovery Manager (RM) is
to save objects in permanent storage (in a
recovery file) for committed transactions
to restore the server's objects after a crash
to reorganize the recovery file to improve the
performance of recovery
to reclaim storage space (in the recovery file).
media failures
i.e. disk failures affecting the recovery file
these need another copy of the recovery file on an
independent disk, e.g. implemented as stable
storage or using mirrored disks
we deal with recovery of 2PC separately (at the
end)
we study logging (13.6.1) but not shadow versions
(13.6.2)

44
Recovery - intentions lists

Each server records an intentions list for each
of its currently active transactions
an intentions list contains a list of the object
references and the values of all the objects that
are altered by a transaction
when a transaction commits, the intentions list
is used to identify the objects affected
the committed version of each object is replaced
by the tentative one
the new value is written to the server's recovery
file
in 2PC, when a participant says it is ready to
commit, its RM must record its intentions list
and its objects in the recovery file
it will be able to commit later on even if it
crashes
when a client has been told a transaction has
committed, the recovery files of all
participating servers must show that the
transaction is committed,
even if they crash between prepare to commit and
commit

45
Types of entry in a recovery file
Why is that a good idea?
Object state flattened to bytes
first entry says prepared

For distributed transactions we need information
relating to the 2PC as well as object values,
that is
transaction status (committed, prepared or
aborted)
intentions list

46
Logging - a technique for the recovery file

the recovery file represents a log of the history
of all the transactions at a server
it includes objects, intentions lists and
transaction status
in the order that transactions prepared,
committed and aborted
a recent snapshot plus a history of transactions
after the snapshot
during normal operation the RM is called whenever
a transaction prepares, commits or aborts
prepare - RM appends to recovery file all the
objects in the intentions list followed by status
(prepared) and the intentions list
commit/abort - RM appends to recovery file the
corresponding status
assume append operation is atomic, if server
fails only the last write will be incomplete
to make efficient use of disk, buffer writes.
Note sequential writes are more efficient than
those to random locations
committed status is forced to the log - in case
server crashes
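
The append-only behaviour described above can be sketched with an in-memory list standing in for the recovery file. The tuple layout of the entries and the method names are assumptions; a real recovery manager would write to disk and force the committed status.

```python
class RecoveryLog:
    """Toy append-only recovery file held in memory."""

    def __init__(self):
        self.entries = []
        self.last_status = None  # position of the previous status entry

    def _append(self, entry):
        self.entries.append(entry)
        return len(self.entries) - 1

    def prepare(self, tid, intentions):
        # Write the tentative object values first, then the prepared status
        # with an intentions list pointing at their positions in the log.
        positions = {obj: self._append(("object", obj, value))
                     for obj, value in intentions.items()}
        self.last_status = self._append(
            ("status", tid, "prepared", positions, self.last_status))

    def finish(self, tid, status):
        # status is "committed" or "aborted"; only the status is appended.
        self.last_status = self._append(
            ("status", tid, status, None, self.last_status))


log = RecoveryLog()
log.prepare("T", {"A": 80, "B": 220})
log.finish("T", "committed")
log.prepare("U", {"C": 278, "B": 242})
for position, entry in enumerate(log.entries):
    print(position, entry)
```

Each status entry carries the position of the previous one, which is what allows the file to be read backwards during recovery.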

47
Log for banking service

Logging mechanism for Fig 12.7 (there would
really be other objects in log file)
initial balances of A, B and C are 100, 200 and 300
T sets A and B to 80 and 220; U sets B and C
to 242 and 278
entries to left of line represent a snapshot
(checkpoint) of values of A, B and C before T
started. T has committed, but U is prepared.
the RM gives each object a unique identifier (A,
B, C in diagram)
each status entry contains a pointer to the
previous status entry, then the checkpoint can
follow transactions backwards through the file

48
Recovery of objects - with logging

When a server is replaced after a crash
it first sets default initial values for its
objects
and then hands over to its recovery manager.
The RM restores the server's objects to include
all the effects of all the committed transactions
in the correct order and
none of the effects of incomplete or aborted
transactions
it reads the recovery file backwards (by
following the pointers)
restores values of objects with values from
committed transactions
continuing until all of the objects have been
restored
if it started at the beginning, there would
generally be more work to do
to recover the effects of a transaction use the
intentions list to find the value of the objects
e.g. look at previous slide (assuming the server
crashed before T committed)
the recovery procedure must be idempotent
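
Backward recovery can be sketched against a small log for the banking example (T committed, U only prepared). The tuple layout, the separate checkpoint dictionary and the function name are assumptions made for the example.

```python
checkpoint = {"A": 100, "B": 200, "C": 300}
log = [
    ("object", "A", 80),                                  # 0
    ("object", "B", 220),                                 # 1
    ("status", "T", "prepared", {"A": 0, "B": 1}, None),  # 2
    ("status", "T", "committed", None, 2),                # 3
    ("object", "C", 278),                                 # 4
    ("object", "B", 242),                                 # 5
    ("status", "U", "prepared", {"C": 4, "B": 5}, 3),     # 6
]


def recover(checkpoint, log):
    """Rebuild committed object values by following status pointers backwards."""
    committed = set()
    restored = {}
    pos = len(log) - 1  # assumes the last entry is a status entry
    while pos is not None:
        _, tid, status, intentions, prev = log[pos]
        if status == "committed":
            committed.add(tid)
        elif status == "prepared" and tid in committed:
            for obj, where in intentions.items():
                # Keep the first value found: reading backwards, it is the latest.
                restored.setdefault(obj, log[where][2])
        pos = prev
    # Objects no committed transaction touched keep their checkpoint values.
    return {**checkpoint, **restored}


print(recover(checkpoint, log))
```

Running `recover` again on the same inputs gives the same result, which is the idempotence the slide asks for; U's tentative values are ignored because U has not committed.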

49
Logging - reorganising the recovery file

RM is responsible for reorganizing its recovery
file
so as to make the process of recovery faster and
to reduce its use of space
checkpointing
the process of writing the following to a new
recovery file
the current committed values of a server's
objects,
transaction status entries and intentions lists
of transactions that have not yet been fully
resolved
including information related to the two-phase
commit protocol (see later)
checkpointing makes recovery faster and saves
disk space
done after recovery and from time to time
can use old recovery file until new one is ready,
add a mark to old file
do as above and then copy items after the mark to
new recovery file
replace old recovery file by new recovery file

50
Figure 13.20 Shadow versions

51
Recovery of the two-phase commit protocol

The above recovery scheme is extended to deal
with transactions doing the 2PC protocol when a
server fails
it uses new transaction status values done and
uncertain (see Fig 13.6)
the coordinator uses committed when the result is
Yes, and
done when 2PC is complete (if a transaction is done,
its information may be removed when reorganising
the recovery file)
the participant uses uncertain when it has voted
Yes, and committed when told the result (uncertain
entries must not be removed from the recovery file)
It also requires two additional types of entry

Type of entry   Description of contents of entry
Coordinator     Transaction identifier, list of participants (added by RM when coordinator prepared)
Participant     Transaction identifier, coordinator (added by RM when participant votes yes)

52
Log with entries relating to two-phase commit
protocol
Start at end, for U find it is committed and a
participant
We have T committed and coordinator
But if the server has crashed before the last
entry we have U uncertain and participant
or if the server crashed earlier we have U
prepared and participant

entries in log for
T, where the server is coordinator (prepared comes
first, followed by the coordinator entry, then
committed; done is not shown)
and U, where the server is participant (prepared comes
first, followed by the participant entry, then
uncertain and finally committed)
these entries will be interspersed with values of
objects
recovery must deal with 2PC entries as well as
restoring objects
where the server was coordinator, find the coordinator
entry and status entries
where the server was participant, find the participant
entry and status entries

53
Recovery of the two-phase commit protocol
the most recent entry in the recovery file
determines the status of the transaction at the
time of failure
the RM action for each transaction depends on
whether server was coordinator or participant and
the status

54
Figure 13.23 Nested transactions

55
Summary of transaction recovery

Transaction-based applications have strong
requirements for the long life and integrity of
the information stored.
Transactions are made durable by performing
checkpoints and logging in a recovery file, which
is used for recovery when a server is replaced
after a crash.
Users of a transaction service would experience
some delay during recovery.
It is assumed that the servers of distributed
transactions exhibit crash failures and run in an
asynchronous system,
but they can reach consensus about the outcome of
transactions because crashed servers are replaced
with new processes that can acquire all the
relevant information from permanent storage or
from other servers