Title: Quick Review of Apr 29 material
1Quick Review of Apr 29 material
- Transformation of Relational Expressions
- Equivalence Rules
- Transactions
- ACID properties (Atomic, Consistent, Isolated,
Durable) - Transaction States (active, partially committed,
failed, committed, aborted) - Concurrent Execution and Serializability
2Serial and Interleaved Schedules
- Serial schedule Interleaved schedule
3Another Interleaved Schedule
- The previous serial and interleaved schedules
ensured database consistency (AB before
execution AB after execution) - The interleaved schedule on the right is only
slightly different, but does not ensure a
consistent result. - assume A1000 and B2000 before (sum 3000)
- after execution, A950 and B2150 (sum 3100)
4Inconsistent Transaction Schedules
- So what caused the problem? What makes one
concurrent schedule consistent, and another one
inconsistent? - Operations on data within a transaction are not
relevant, as they are run on copies of the data
residing in local buffers of the transaction - For scheduling purposes, the only significant
operations are read and write - Given two transactions, Ti and Tj, both
attempting to access data item Q - if Ti and Tj are both executing read(Q)
statements, order does not matter - if Ti is doing write(Q), and Tj read(Q), then
order does matter - same if Tj is writing and Ti reading
- if both Ti and Tj are executing write(Q), then
order might matter if there are any subsequent
operations in Ti or Tj accessing Q
5Transaction Conflicts
- Two operations on the same data item by different
transactions are said to be in conflict if at
least one of the operations is a write - If two consecutive operations of different
transactions in a schedule S are not in conflict,
then we can swap the two to produce another
schedule S that is conflict equivalent with S - A schedule S is serializable if it is conflict
equivalent (after some series of swaps) to a
serial schedule.
6Transaction Conflict Example
- Example read/write(B) in T0 do not conflict with
read/write(A) in T1 - T0 T1 T0 T1
- read(A) read(A)
- write(A) write(A)
- read(A) read(B)
- read(B) write(B)
- write(A) read(A)
- write(B) write(A)
- read(B) read(B)
- write(B) write(B)
7Serializability Testing (15.9) and Precedence
Graphs
- So we need a simple method to test a schedule S
and discover whether it is serializable. - Simple method involves constructing a directed
graph called a Precedence Graph from S - Construct a precedence graph as follows
- a vertex labelled Ti for every transaction in S
- an edge from Ti to Tj if any of these three
conditions holds - Ti executes write(Q) before Tj executes read(Q)
- Ti executes read(Q) before Tj executes write(Q)
- Ti executes write(Q) before Tj executes write(Q)
- if the graph has a cycle, S is not serializable
8Precedence Graph Example 1
- Compute a precedence graph for schedule B (right)
- three vertices (T1, T2, T3)
- edge from Ti to Tj if
- Ti writes Q before Tj reads Q
- Ti reads Q before Tj writes Q
- Ti writes Q before Tj writes Q
9Precedence Graph Example 1
- Compute a precedence graph for schedule B (right)
- three vertices (T1, T2, T3)
- edge from Ti to Tj if
- Ti writes Q before Tj reads Q
- Ti reads Q before Tj writes Q
- Ti writes Q before Tj writes Q
10Precedence Graph Example 2
- Slightly more complicated example
- Compute a precedence graph for schedule A (right)
- three vertices (T1, T2, T3)
- edge from Ti to Tj if
- Ti writes Q before Tj reads Q
- Ti reads Q before Tj writes Q
- Ti writes Q before Tj writes Q
11Precedence Graph Example 2
- Slightly more complicated example
- Compute a precedence graph for schedule A (right)
12Concurrency Control
- So now we can recognize when a schedule is
serializable. In practice, it is often difficult
and inefficient to determine a schedule in
advance, much less examine it for
serializability. - Lock-based protocols are a common system used to
prevent transaction conflicts on the fly (i.e.,
without knowing what operations are coming later) - Basic concept is simple to prevent transaction
conflict (two transactions working on the same
data item with at least one of them writing), we
implement a lock system -- a transaction may only
access an item if it holds the lock on that item.
13Lock-based Protocols
- We recognize two modes of locks
- shared if Ti has a shared-mode (S) lock on
data item Q then Ti may read, but not write, Q. - exclusive if Ti has an exclusive-mode (X) lock
on Q then Ti can both read and write Q. - Transactions be granted a lock before accessing
data - A concurrency-control manager handles granting of
locks - Multiple S locks are permitted on a single data
item, but only one X lock - this allows multiple reads (which dont create
serializability conflicts) but prevents any R/W,
W/R, or W/W interactions (which create conflicts)
14Lock-based Protocols
- Transactions must request a lock before accessing
a data item they may release the lock at any
time when they no longer need access - If the concurrency-control manager does not grant
a requested lock, the transaction must wait until
the data item becomes available later on. - Unfortunately, this can lead to a situation
called a deadlock - suppose T1 holds a lock on item R and requests a
lock on Q, but transaction T2 holds an exclusive
lock on Q. So T1 waits. Then T2 gets to where
it requests a lock on R (still held by waiting
T1). Now both transactions are waiting for each
other. Deadlock.
15Deadlocks
- To detect a deadlock situation we use a wait-for
graph - one node for each transaction
- directed edge Ti --gt Tj if Ti is waiting for a
resource locked by Tj - a cycle in the wait-for graph implies a deadlock.
- The system checks periodically for deadlocks
- If a deadlock exists, one of the nodes in the
cycle must be aborted - 95 of deadlocks are between two transactions
- deadlocks are a necessary evil
- preferable to allowing the database to become
inconsistent - deadlocks can be rolled back inconsistent data
is much worse
16Two-phase Locking Protocol
- A locking protocol is a set of rules for placing
and releasing locks - a protocol restricts the number of possible
schedules - lock-based protocols are pessimistic
- Two-phase locking is (by far) the most common
protocol - growing phase a transaction may only obtain
locks (never release any of its locks) - shrinking phase a transaction may only release
locks (never obtain any new locks)
17Two-phase with Lock Conversion
- Two-phase with lock conversion
- S can be upgraded to X during the growing phase
- X can be downgraded to S during the shrinking
phase (this only works if the transaction has
already written any changed data value with an X
lock, of course) - The idea here is that during the growing phase,
instead of holding on X on an item that it
doesnt need to write yet, to hold an S lock on
it instead (allowing other transactions to read
the old value for longer) until the point where
modifications to the old value begin. - Similarly in the shrink phase, once a transaction
downgrades an X lock, other transactions can
begin reading the new value earlier.
18Variants on Two-phase Locking
- Strict two-phase locking
- additionally requires that all X locks are held
until commit time - prevents any other transactions from seeing
uncommitted data - viewing uncommitted data can lead to cascading
rollbacks - Rigorous two-phase locking
- requires that ALL locks are held until commit time