Title: Distributed Transaction Processing
1 Distributed Transaction Processing
- presented by Mark Wood, XiuJia Jin, Lin Chen
2 Overview of Transactions
- Transaction
- An action, or series of actions, carried out by a single user or application program, which reads or updates the contents of the database.
- A logical unit of work.
- The execution of an application program may involve one or more transactions, with non-database processing in between.
3 Overview of Transactions
- An example
- Staff (staffNo, fName, lName, position, sex, DOB, salary, branchNo)
- PropertyForRent (propertyNo, street, city, postcode, type, rooms, rent, ownerNo, staffNo, branchNo)
- read(staffNo = x, salary)
- salary = salary * 1.1
- write(staffNo = x, salary)
- The transaction consists of two database operations (read and write) and a non-database operation (salary = salary * 1.1); see the sketch below.
- A transaction should always transform the database from one consistent state to another.
- Inconsistency is allowed during the middle of the transaction.
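A minimal sketch of this salary-update transaction using Python's sqlite3. The table, staff number, and connection setup are illustrative assumptions, not part of the slides; the point is the read/compute/write pattern bracketed by commit or rollback.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, salary REAL)")
conn.execute("INSERT INTO Staff VALUES ('SG37', 12000)")  # hypothetical row
conn.commit()

try:
    cur = conn.cursor()
    # read(staffNo = x, salary)
    (salary,) = cur.execute(
        "SELECT salary FROM Staff WHERE staffNo = ?", ("SG37",)).fetchone()
    salary = salary * 1.1                # non-database operation
    # write(staffNo = x, salary)
    cur.execute("UPDATE Staff SET salary = ? WHERE staffNo = ?",
                (salary, "SG37"))
    conn.commit()                        # transaction committed
except Exception:
    conn.rollback()                      # transaction aborted: undo all changes
```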
4 Transaction Outcomes
- Two outcomes: committed and aborted
- Committed
- The transaction completes successfully.
- Aborted
- The transaction does not execute successfully.
- The database must be restored to a consistent state:
- Rolled back or undone, if the transaction is aborted.
- Compensating transaction, if the transaction is committed.
- A committed transaction cannot be aborted!!!
5 ACID
- Four properties
- Atomicity
- The "all or nothing" property
- Indivisible
- The DBMS's responsibility
- Consistency
- Transforms the database from one consistent state to another
- Both the DBMS's and the application developers' responsibility
- Isolation
- Transactions execute independently of one another.
- The concurrency control subsystem's responsibility.
- Durability
- The results of a successful transaction are permanently recorded in the database and must not be lost because of a subsequent failure.
6 Database Architecture
[Diagram: DBMS transaction subsystem, comprising the transaction manager, scheduler, buffer manager, recovery manager, access manager, file manager, and system manager, over the database and system catalog]
7 Concurrency Control
- Definition
- The process of managing simultaneous operations on the database without having them interfere with one another.
- Lost update problem
- A successfully completed update is overwritten by another user.
- Uncommitted dependency (dirty read) problem
- One transaction is allowed to see the intermediate results of another transaction before it has committed.
- Inconsistent analysis problem
- A transaction reads several values from the database, but a second transaction updates some of them during the execution of the first.
8 The lost update problem
[Diagram: initial balance = 100; A withdraws 10 while B deposits 100]
Solution: prevent A from reading the value of balx until after B's update has completed. The sketch below reproduces the interleaving.
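A sketch of the lost-update interleaving with the slide's values; the local variables standing in for each transaction's workspace are illustrative. B's deposit is written first, then overwritten by A's stale value, so the deposit is lost.

```python
balance = 100            # shared data item balx

bal_a = balance          # A: read(balx)
bal_b = balance          # B: read(balx), before A has written
bal_b = bal_b + 100      # B: deposit 100
balance = bal_b          # B: write(balx) -> 200
bal_a = bal_a - 10       # A: withdraw 10 from the stale value
balance = bal_a          # A: write(balx) -> 90; B's update is lost

print(balance)           # 90, instead of the correct 190
```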
9 Dirty read problem
[Diagram: A reads a dirty value written by B]
Solution: prevent A from reading balx until after the decision has been made to either commit or abort B's transaction.
10 Inconsistent analysis problem
Solution: prevent transaction A from reading balx and baly until after B has completed its updates.
11 Other problems
- Nonrepeatable (fuzzy) reads
- A transaction T rereads a data item it has previously read but, in between, another transaction has modified it. So T reads two different values of the same item.
- Phantom reads
- T executes a query that retrieves a set of tuples from a relation satisfying a certain predicate, re-executes the query at a later time, and finds that the retrieved set contains an additional (phantom) tuple inserted by another transaction in the meantime.
12 Serializability
- Serializability: a means of helping to identify those executions of transactions that are guaranteed to ensure consistency.
- Schedule
- A sequence of the operations of a set of concurrent transactions that preserves the order of the operations within each of the individual transactions.
- Serial schedule
- A schedule where the operations of each transaction are executed consecutively, without any interleaved operations from other transactions.
- Nonserial schedule
- A schedule where the operations from a set of concurrent transactions are interleaved.
13 Serializability
- The ordering of read and write operations is important:
- If two transactions only read a data item, they do not conflict and order is not important.
- If two transactions either read or write completely separate data items, they do not conflict and order is not important.
- If one transaction writes a data item and another either reads or writes the same data item, the order of execution is important.
14 Serializability Example
[Diagram: schedules S1, S2, and S3]
Schedule S3 is a serial schedule. The nonserial schedules S1 and S2 are equivalent to the serial schedule S3, so S1 and S2 are serializable.
15 Testing for conflict serializability
- Constrained write rule
- A transaction updates a data item based on its old value, which is first read by the transaction.
- A precedence graph can be produced to test for conflict serializability:
- Create a node for each transaction.
- Create a directed edge Ti → Tj if Tj reads the value of an item written by Ti.
- Create a directed edge Ti → Tj if Tj writes a value into an item after it has been read by Ti.
- Create a directed edge Ti → Tj if Tj writes a value into an item after it has been written by Ti.
- If an edge Ti → Tj exists in the precedence graph for S, then in any serial schedule S' equivalent to S, Ti must appear before Tj. If the precedence graph contains a cycle, the schedule is not conflict serializable (see the sketch below).
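A sketch of the precedence-graph test. The three edge rules above amount to: two operations on the same item by different transactions conflict if at least one is a write, and the earlier operation's transaction gets the outgoing edge. The schedule encoding (transaction, operation, item) is an assumption for illustration.

```python
def precedence_graph(schedule):
    """Build directed edges (Ti, Tj) for each pair of conflicting operations."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "write" in (op_i, op_j):
                edges.add((ti, tj))   # Ti's op precedes a conflicting op of Tj
    return edges

def has_cycle(edges):
    """Depth-first search for a back edge, i.e. a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visited, stack = set(), set()
    def visit(n):
        if n in stack:
            return True               # back edge: cycle found
        if n in visited:
            return False
        visited.add(n); stack.add(n)
        if any(visit(m) for m in graph.get(n, ())):
            return True
        stack.discard(n)
        return False
    return any(visit(n) for n in list(graph))

s = [("T1", "read", "x"), ("T2", "write", "x"),
     ("T2", "read", "y"), ("T1", "write", "y")]
print(has_cycle(precedence_graph(s)))   # True -> not conflict serializable
```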
16 Example of a precedence graph
[Diagram: precedence graph with nodes A and B and edges labelled x and y forming a cycle]
The schedule is not conflict serializable.
17 View Serializability
- Less stringent than conflict serializability
- A schedule is view serializable if it is view equivalent to a serial schedule.
- Every conflict serializable schedule is view serializable, although the converse is not true.
- Testing for view serializability is more complex than testing for conflict serializability.
- Proven to be NP-complete (Papadimitriou, 1979)
18 View Serializability Example
B and C do not conform to the constrained write rule: B and C perform blind writes. This view serializable schedule is not conflict serializable.
19 View Serializability Example: Precedence Graph
[Diagram: precedence graph over A, B, and C; the edges produced by the blind writes should not have been inserted]
This precedence graph contains a cycle, so the schedule is not conflict serializable. The view serializable schedule was equivalent to the serial schedule A followed by B followed by C.
20 Concurrency control techniques
- Goal: allow transactions to execute safely in parallel.
- Concurrency control techniques
- Locking
- Timestamping
- Optimistic methods
- Locking and timestamping are the main techniques.
21 Locking
- Locking
- A procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect results.
- Shared lock
- If a transaction has a shared lock on a data item, it can read the item but not update it.
- Exclusive lock
- If a transaction has an exclusive lock on a data item, it can both read and update the item.
- Implementation on a data item
- Set a bit on the data item.
22 Granularity of data items
- Granularity: the size of the data items chosen as the unit of protection by a concurrency control protocol.
- The entire database
- A file
- A page
- A record
- A field value of a record
- Trade-off: granularity of the data item vs. the degree of concurrency.
- The best item size depends upon the nature of the transactions.
[Diagram: as granularity becomes coarser, the degree of concurrency decreases and less locking information is needed]
23 Two-phase locking (2PL)
- 2PL: a transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction.
- Growing phase: the transaction acquires all the locks it needs but cannot release any locks.
- Shrinking phase: the transaction releases its locks but cannot acquire any new locks.
- Rules (see the sketch below)
- A transaction must acquire a lock on an item before operating on the item. The lock may be read or write, depending on the type of access needed.
- Once the transaction releases a lock, it can never acquire any new locks.
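A minimal sketch of the two-phase rule, assuming a transaction object that merely tracks its phase; lock-manager details (shared vs. exclusive modes, waiting, deadlock handling) are omitted.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False        # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after first unlock")
        self.locks.add(item)          # growing phase

    def unlock(self, item):
        self.shrinking = True         # shrinking phase begins
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("balx"); t.lock("baly")        # growing phase
t.unlock("balx")                      # shrinking phase
try:
    t.lock("balz")                    # not allowed: violates 2PL
except RuntimeError as e:
    print(e)
```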
24 Example of using locks
[Diagram: an example schedule in which a transaction requests a lock and is granted it]
25 2PL (continued)
- Advantage
- Guarantees serializability.
- Problems
- Cascading rollbacks
- The rollback of a single transaction leads to a series of rollbacks, potentially undoing a significant amount of work.
- Deadlock
- Two transactions each wait for locks on items held by the other.
- Livelock
- A transaction is left waiting indefinitely.
26 An example of cascading rollback
27 An example of deadlock
28 Timestamping methods
- Timestamp
- A unique identifier created by the DBMS that indicates the relative starting time of a transaction.
- No locks, no waiting
- Conflicting transactions are simply rolled back and restarted.
- Timestamping
- A concurrency control protocol that orders transactions in such a way that older transactions, those with smaller timestamps, get priority in the event of conflict.
- Timestamps exist both for transactions and for data items:
- ts(T), where T is a transaction
- read_timestamp(x), where x is a data item
- write_timestamp(x), where x is a data item
29 Basic timestamp ordering
- Example with ts(T19) < ts(T20) < ts(T21):
- ts(T19) > write_timestamp(balx) = 0: T19 proceeds.
- ts(T20) > write_timestamp(baly) = 0: T20's read proceeds.
- ts(T21) > write_timestamp(baly) = 0: T21's read proceeds, and read_timestamp(baly) = ts(T21).
- ts(T20) < read_timestamp(baly) = ts(T21): T20's write fails, so T20 restarts with a new timestamp, giving ts(T19) < ts(T21) < ts(T20).
- ts(T21) ≥ read_timestamp(baly) = ts(T21) and ts(T21) > write_timestamp(baly) = 0: T21's write proceeds.
- Rules (see the sketch below):
- read(x): if ts(T) < write_timestamp(x), abort and restart; if ts(T) > write_timestamp(x), proceed.
- write(x): if ts(T) < read_timestamp(x), roll back and restart; if ts(T) < write_timestamp(x), roll back and restart; otherwise proceed.
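A sketch of the read/write checks above. The read_ts/write_ts dictionaries stand in for the per-item timestamps, defaulting to 0 as in the example.

```python
read_ts, write_ts = {}, {}

def read(ts, x):
    if ts < write_ts.get(x, 0):
        return "abort and restart"        # x already overwritten by a younger txn
    read_ts[x] = max(read_ts.get(x, 0), ts)
    return "proceed"

def write(ts, x):
    if ts < read_ts.get(x, 0) or ts < write_ts.get(x, 0):
        return "roll back and restart"    # a younger txn already read or wrote x
    write_ts[x] = ts
    return "proceed"

print(read(21, "baly"))    # proceed; read_timestamp(baly) becomes 21
print(write(20, "baly"))   # roll back and restart (20 < 21), as T20 does above
print(write(21, "baly"))   # proceed, as T21 does above
```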
30 Database Recovery
- The recovery manager guarantees two of the four ACID properties: atomicity and durability.
- Problem: writing is not atomic.
- A transaction may be committed while its effects have not yet reached the database, i.e. its data updates have not been recorded in the database.
[Diagram: database buffers (in main memory) are flushed to the database (on secondary storage)]
- Flushing: the buffers are written back to secondary storage.
- Force-writing: the explicit writing of the buffers to secondary storage.
- An update operation is regarded as permanent only after flushing.
31 Buffer management
- Buffer replacement strategies: FIFO, LRU
- Two policy dimensions govern when a page is written back to disk:
- Steal policy (vs. no-steal)
- A buffer may be written to disk before the owning transaction commits.
- Force policy (vs. no-force)
- All pages updated by a transaction are written to disk when the transaction commits.
- A steal, no-force policy
- Steal avoids the need for a very large buffer space to store all updated pages.
- No-force avoids rewriting a page to disk for a later transaction when it was already updated by an earlier committed transaction and may still be in a database buffer.
- Used by most DBMSs.
32 Recovery facilities
- Backup mechanism
- Makes periodic backup copies of the database.
- Logging facilities
- Keep track of the current state of transactions and database changes.
- A checkpoint facility
- Enables updates to the database that are in progress to be made permanent.
- A recovery manager
- Allows the system to restore the database to a consistent state following a failure.
33 Log file (log or journal)
- Log files contain
- Transaction records
- Checkpoint records
- Important!!! Log files are duplexed or triplexed.
34 Checkpointing
- Problem: how far back do we need to look in the log file?
- Checkpoints eliminate most of the search time.
- Checkpoint
- The point of synchronization between the database and the transaction log file. All buffers are force-written to secondary storage.
- Checkpoint operation
- Write all log records in main memory to secondary storage.
- Write the modified blocks in the database buffers to secondary storage.
- Write a checkpoint record to the log file. This record contains the identifiers of all transactions that are active at the time of the checkpoint.
35 Use of UNDO/REDO
[Timeline: the DBMS starts at t0 and fails at t1]
DBMS action: UNDO T1 and T6; REDO T2, T3, T4, and T5.
36 Use of UNDO/REDO with checkpoint
[Timeline: the DBMS starts at t0, takes a checkpoint at tc, and fails at t1]
DBMS action: UNDO T1 and T6; REDO T4 and T5. There is no need to REDO T2 and T3, which committed before tc (see the sketch below).
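A sketch of how the checkpoint record narrows the UNDO/REDO decision. The log encoding and the set of transactions active at the checkpoint are assumptions chosen to match the timeline above.

```python
def recovery_sets(active_at_checkpoint, log_after_checkpoint):
    """Return (undo, redo) transaction sets after a crash."""
    undo = set(active_at_checkpoint)   # from the checkpoint record
    redo = set()
    for txn, event in log_after_checkpoint:
        if event == "start":
            undo.add(txn)              # started after the checkpoint
        elif event == "commit":
            undo.discard(txn)          # committed: must be redone, not undone
            redo.add(txn)
    return undo, redo

# T1, T4, T5 active at tc; T4 and T5 then commit; T6 starts but never commits.
undo, redo = recovery_sets(
    {"T1", "T4", "T5"},
    [("T4", "commit"), ("T5", "commit"), ("T6", "start")])
print(sorted(undo), sorted(redo))      # ['T1', 'T6'] ['T4', 'T5']
```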
37 Recovery techniques
- Recovery depends on how bad the damage is.
- The database storage is physically damaged (e.g. a disk head crash).
- The database is in an inconsistent state (our focus).
- Recovery techniques
- Deferred update
- Immediate update
- Shadow paging
38 Deferred update
- Updates are not written to the database until after a transaction has reached its commit point.
- In case of failure before commit:
- UNDO is unnecessary (nothing reached the database).
- REDO of committed transactions is necessary.
- Use the log file as follows:
- When a transaction starts, write a transaction start record to the log.
- When any write operation is performed, write a log record containing all the log data specified previously. Do not write the update to the database buffers or the database itself.
- When a transaction is about to commit, write a transaction commit log record, write all the log records for the transaction to disk, and then commit the transaction.
- If a transaction aborts, write a transaction abort log record, ignore the log records for the transaction, and do not perform the writes.
39 How to recover using deferred update
- Starting at the last entry in the log file, go back to the most recent checkpoint record.
- Any transaction with both transaction start and transaction commit log records should be redone:
- Use the after-image log records for the transaction, in the order in which they were written to the log.
- For any transaction with transaction start and transaction abort log records, do nothing, since no actual writing was done to the database.
- No transactions need to be undone (see the sketch below).
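A sketch of this redo-only recovery, assuming a simple tuple-based log format; only committed transactions' after-images are applied, in log order.

```python
def deferred_recover(log, database):
    committed = {r[1] for r in log if r[0] == "commit"}
    for record in log:                       # in the order written
        if record[0] == "write" and record[1] in committed:
            _, txn, item, after_image = record
            database[item] = after_image     # REDO using the after-image

db = {}
log = [("start", "T1"), ("write", "T1", "balx", 90),
       ("commit", "T1"),
       ("start", "T2"), ("write", "T2", "baly", 50)]   # T2 never committed
deferred_recover(log, db)
print(db)   # {'balx': 90}: T2's write never reached the database, no UNDO needed
```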
40 Immediate update
- Updates are applied to the database as they occur, without waiting to reach the commit point.
- In case of failure:
- Both UNDO and REDO are necessary.
- Use the log file as follows:
- When a transaction starts, write a transaction start record to the log.
- When any write operation is performed, write a record containing the necessary data to the log file.
- Once the log record is written, write the update to the database buffers.
- The updates to the database itself are written when the buffers are next flushed to secondary storage.
- When the transaction commits, write a transaction commit record to the log.
41 How to recover using immediate update
- For any transaction for which both a transaction start and a transaction commit record appear in the log, redo it, using the log records to write the after-images of the updated fields.
- For any transaction for which the log contains a transaction start record but no transaction commit record, undo that transaction:
- The log records are used to write the before-images of the affected fields, restoring the database to its state prior to the transaction's start,
- in the reverse of the order in which they were written to the log (see the sketch below).
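A sketch of the two passes, assuming write records that carry both a before-image and an after-image; the redo pass runs forward through the log, the undo pass backward.

```python
def immediate_recover(log, database):
    committed = {r[1] for r in log if r[0] == "commit"}
    for rec in log:                                   # REDO pass, forward
        if rec[0] == "write" and rec[1] in committed:
            database[rec[2]] = rec[4]                 # after-image
    for rec in reversed(log):                         # UNDO pass, reverse
        if rec[0] == "write" and rec[1] not in committed:
            database[rec[2]] = rec[3]                 # before-image

# T2's update reached the database before the crash; T1 committed.
db = {"balx": 100, "baly": 50}
log = [("start", "T1"), ("write", "T1", "balx", 100, 90), ("commit", "T1"),
       ("start", "T2"), ("write", "T2", "baly", 60, 50)]   # T2 uncommitted
immediate_recover(log, db)
print(db)   # {'balx': 90, 'baly': 60}: T1 redone, T2 undone
```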
42 Nested Transaction Model
Nested transaction model: a transaction is viewed as a collection of related subtasks, or subtransactions, each of which may also contain any number of subtransactions.
[Diagram: a top-level transaction T1 containing nested subtransactions]
43 Distributed DBMSs
- Distributed database
- A logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network.
- Distributed DBMS
- The software system that permits the management of the distributed database and makes the distribution transparent to users.
- Distributed processing
- A centralized database that can be accessed over a computer network.
44 DDBMS and Distributed Processing
- DDBMS
- Data is physically distributed.
- Consists of a single logical database that is split into a number of fragments.
- It is not necessary for every site in the system to have its own local database.
- Distributed processing
- Data is centralized.
- Users may access the data over the network.
45 Distributed Transaction
- A distributed transaction accesses data stored at more than one location.
- Each transaction is divided into a number of subtransactions.
- A subtransaction is represented by an agent.
- Requires transaction transparency.
46 Transaction Transparency
- Two aspects of transaction transparency
- Concurrency transparency
- All concurrent transactions execute independently, and the results are logically consistent with those obtained if the transactions were executed one at a time.
- The DDBMS must ensure that both global and local transactions do not interfere with each other.
- Failure transparency
- The DDBMS must ensure the atomicity of the global transaction.
47 Distributed Transaction Management
- Must ensure the atomicity of the global transaction and of each component subtransaction.
- The four high-level database modules that handle transactions, concurrency control, and recovery in a centralized DBMS also exist in each local DBMS of a DDBMS:
- the transaction manager, scheduler, recovery manager, and buffer manager.
- In addition, there is also a global transaction manager, or transaction coordinator, at each site to coordinate the execution of both the global and local transactions initiated at that site.
- Inter-site communication is through the data communications component.
48 Distributed Transaction Management
- TC1 divides the transaction into subtransactions.
- The data communication component at S1 sends the subtransactions to the appropriate sites, say S2 and S3.
- The transaction coordinators at S2 and S3 manage the subtransactions.
- The results of the subtransactions are sent back to TC1 via the data communication components.
49 Distributed Concurrency Control
- A good concurrency control mechanism for a DDBMS should
- be resilient to site and communication failure,
- permit parallelism,
- incur modest overhead,
- perform satisfactorily, and
- place few constraints on the structure of atomic actions.
- Additional problems arise as a result of data distribution:
- the multiple-copy consistency problem.
50 Distributed Serializability
- If the schedule of transaction execution at each site is serializable, then the global schedule (the union of all local schedules) is also serializable, provided the local serialization orders are identical.
- All subtransactions should appear in the same order in the equivalent serial schedule at all sites.
- The concurrency control solutions are based on locking and timestamping.
51 Locking Protocols
- Four types of locking protocol for distributed DBMSs:
- Centralized 2PL
- Primary copy 2PL
- Distributed 2PL
- Majority locking
52 Centralized 2PL
- A single site maintains all locking information.
- The coordinator requests exclusive locks on all copies before updating each copy.
- Local transaction managers request and release locks from the centralized lock manager using the normal 2PL rules.
- There is only one scheduler/lock manager.
- Advantages and disadvantages
- Advantages
- The implementation is straightforward.
- Communication costs are relatively low.
- Disadvantages
- Centralization in a distributed DBMS causes bottlenecks and lowers reliability.
- The failure of the central site would cause major system failures.
53 Primary Copy 2PL
- Attempts to overcome the disadvantages of the centralized 2PL protocol.
- Distributes the lock managers to a number of sites.
- For each replicated data item, one copy is chosen as the primary copy; the others are slave copies.
- The site chosen to manage the locks for a primary copy does not have to hold that copy.
- This protocol only guarantees that the primary copy is current.
- Advantages and disadvantages
- Advantages
- Lower communication costs
- Better performance than centralized 2PL (less remote locking)
- Disadvantages
- Deadlock handling is more complex (multiple lock managers).
- There is still a degree of centralization (lock requests for a specific primary copy can only be handled by one site).
54 Distributed 2PL
- Also attempts to overcome the disadvantages of centralized 2PL.
- Distributes the lock managers to every site.
- Each lock manager is responsible for managing the locks for the data at its site.
- Equivalent to primary copy 2PL if the data is not replicated.
- Otherwise, it implements a ROWA (Read-One-Write-All) replica control protocol, sketched below:
- Any copy of a replicated item can be used for a read operation.
- But all copies must be exclusively locked before an item can be updated.
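A sketch of the ROWA rule; the replica table and lock bookkeeping are illustrative assumptions, with no lock-conflict or waiting logic.

```python
copies = {"balx": ["site1", "site2", "site3"]}   # replica locations (assumed)
locks = {}                                       # (item, site) -> lock mode

def read_lock(item):
    site = copies[item][0]                       # any single copy suffices
    locks[(item, site)] = "shared"

def write_lock(item):
    for site in copies[item]:                    # every copy must be locked
        locks[(item, site)] = "exclusive"

read_lock("balx")    # reads touch one site
write_lock("balx")   # writes touch all three sites
print(locks)
```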
55 Distributed 2PL
- Advantage and disadvantages
- Advantage
- Deals with locks in a decentralized manner, thus avoiding the drawbacks of centralized control.
- Disadvantages
- Deadlock handling is more complex.
- Communication costs are higher than for primary copy 2PL, since all copies must be locked before an update.
56 Majority Locking
- An extension of distributed 2PL.
- Overcomes having to lock all copies of a replicated item before an update.
- Also maintains a lock manager at each site.
- How majority locking works (see the sketch below):
- If a transaction wishes to read or write a data item that is replicated at n sites, it must obtain locks on a majority of the copies (by sending lock requests to more than n/2 sites).
- Any number of transactions can simultaneously hold a shared lock on a majority of the copies.
- Only one can hold an exclusive lock on a majority of the copies.
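A sketch of the majority test itself; the grant set is an illustrative stand-in for lock replies received from the sites holding copies.

```python
def majority_locked(grants, n_sites):
    """A lock is held once more than n/2 of the n copies have granted it."""
    return len(grants) > n_sites / 2

print(majority_locked({"site1", "site2"}, 3))   # True: 2 of 3 is a majority
print(majority_locked({"site1"}, 3))            # False: must wait or retry
```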
57 Majority Locking
- Advantage and disadvantages
- Advantage
- Overcomes the drawbacks of centralized control.
- Disadvantages
- Deadlock handling is more complicated.
- This protocol is overly strong in the case of shared locks.
58 Distributed Timestamp Protocols
- Need to generate timestamps that are unique both locally and globally.
- The general approach is to use the concatenation of the local timestamp with a unique site identifier: <local timestamp, site identifier> (Lamport, 1978).
- Sites synchronize their timestamps to prevent a busy site from generating larger timestamps than slower sites.
- Example (see the sketch below)
- S1 sends <10, 1> to S2, whose timestamp is <15, 2>: site 2 would not change its timestamp.
- S1 sends <10, 1> to S2, whose timestamp is <5, 2>: site 2 would change its timestamp to <11, 2>.
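A sketch of this timestamp scheme; the Site class is illustrative. Each site increments a local counter per event and, on receiving a message, advances its counter past the sender's, which reproduces the second case in the example.

```python
class Site:
    def __init__(self, site_id, counter=0):
        self.site_id, self.counter = site_id, counter

    def next_timestamp(self):
        self.counter += 1
        return (self.counter, self.site_id)   # globally unique pair

    def receive(self, sender_counter):
        # Synchronize: never fall behind a busier site.
        self.counter = max(self.counter, sender_counter)

s2 = Site(2, counter=4)        # S2's next timestamp would have been <5, 2>
s2.receive(10)                 # a message stamped <10, 1> arrives from S1
print(s2.next_timestamp())     # (11, 2), matching the slide's example
```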
59 Distributed Deadlock Management
- Deadlock detection may be more complicated in a distributed environment if lock management is not centralized.
centralized. - Example
60 Centralized Deadlock Detection
- A single site, appointed as the Deadlock Detection Coordinator (DDC), has the responsibility of constructing and maintaining the global Wait-For Graph (WFG).
- Each lock manager periodically transmits its local WFG to the DDC.
- The DDC checks for cycles in the global WFG and breaks each cycle by selecting the transactions to be rolled back and restarted.
- The local lock managers need only send the changes that have occurred in the local WFG since the last transmission.
- The failure of the central site would cause problems.
61 Hierarchical Deadlock Detection
- The sites in the network are organized into a hierarchy.
- Each site sends its local WFG to the deadlock detection site above it in the hierarchy.
- The root of the tree is a global deadlock detector that detects deadlocks between sites.
- Advantage
- Reduces the dependence on a centralized detection site.
- Disadvantage
- Much more complex to implement, especially in the presence of site and communication failures.
62 Distributed Deadlock Detection
- An external node, T_ext, is added to each local WFG.
- Example of local WFGs and their merging:
- T_ext → T2 → T3 → T_ext
- T_ext → T3 → T1 → T_ext
- T_ext → T1 → T2 → T_ext
- Merged: T_ext → T3 → T1 → T2 → T3 → T_ext
- Merged: T_ext → T3 → T1 → T2 → T_ext
- The global WFG contains a cycle that does not involve the external node, so we can conclude that a deadlock exists.