Title: Controlled concurrency
1Controlled concurrency
- Now we start looking at what kind of concurrency we should allow
- We first look at uncontrolled concurrency and see what happens
- We look at 3 bad examples
- We then look at how we can understand whether concurrency is OK or not
- Then we look at how to control concurrency
2FIGURE 21.3 (a) The lost update problem.
This occurs when two transactions that access
the same database items have their operations
interleaved in a way that makes the value of some
database item incorrect.
3FIGURE 21.3 (b) The temporary update (dirty read)
problem.
When one transaction updates a database item and
then the transaction fails, the updated item is
accessed by another transaction before it is
changed back to its original value.
E.g. X = 20, Y = 15, M = 2, N = 3
- Here we have issues of both concurrency and recovery
4FIGURE 21.3 (c) The incorrect summary problem.
If one T is calculating an aggregate summary
function on a number of records while another T
is updating some of these records, the aggregate
function may calculate some values before they
are updated and others after they are updated.
E.g. A = 2, N = 3, X = 10, Y = 8
5Serial Schedules
- Serial schedule: A schedule S is serial if, for every transaction T in the schedule, all operations of T are executed consecutively in S
- i.e. all of one T has to finish before another T starts
- E.g. T2 T1 T3 is serial
- Otherwise, the schedule is called a nonserial or interleaved schedule
- S1: r1(x), w1(x), r2(x), r2(y) is serial (T1 T2)
- S2: r1(x), r2(x), w1(x), r2(y) is interleaved
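The serial versus interleaved distinction can be checked mechanically. The following is a minimal Python sketch, assuming schedules are written as strings in the r1(x) / w1(x) notation of this slide. The helper names `parse` and `is_serial` are illustrative and do not come from the textbook.

```python
import re

# Matches operations such as r1(x) or w2(y): kind, transaction number, data item.
OP = re.compile(r"([rw])(\d+)\(\s*(\w+)\s*\)")

def parse(schedule):
    """Turn 'r1(x), w2(y)' into [('r', 1, 'x'), ('w', 2, 'y')]."""
    return [(kind, int(txn), item) for kind, txn, item in OP.findall(schedule)]

def is_serial(schedule):
    """True if every transaction's operations appear consecutively."""
    seen = []  # transactions in order of first appearance
    for _, txn, _ in parse(schedule):
        if seen and seen[-1] == txn:
            continue          # the current transaction keeps running
        if txn in seen:
            return False      # an earlier transaction resumes: interleaved
        seen.append(txn)
    return True

print(is_serial("r1(x), w1(x), r2(x), r2(y)"))  # S1: True
print(is_serial("r1(x), r2(x), w1(x), r2(y)"))  # S2: False
```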
6Concurrency
- How to deal with problems of inconsistency of data because of concurrency?
- Like in the 3 examples we saw earlier
- Only allow serial execution. Problem?
- Wasteful: while T1 is doing I/O, T2 is forced to wait
- Solution: allow controlled concurrency
- Allow when no conflict
- Don't allow when conflict
- Now we see how to do controlled concurrency
7Concurrency Eg Figure 21.5
- Which of C, D should be allowed?
- E.g.
- X = 50
- M = 10
- N = 5
8Different serial schedules
- Will 2 diff. serial schedules always give the same results?
- No: diff. serial schedules can give diff. results. E.g.
- T1: r(x), r(y), x = x + y, w(x)
- T2: r(x), r(y), y = x + y, w(y)
- x = 20, y = 30
- Serial schedule T1T2: final values of X, Y?
- Serial schedule T2T1: final values of X, Y?
- Any serial execution is OK. Why?
- Otherwise we should not allow concurrency at all.
- E.g. suppose T1T2 is OK, but T2T1 is not OK
- Then all of T1 has to happen before all of T2
- Makes no sense to talk about T1 and T2 executing concurrently
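The two serial orders on this slide can be worked through in a few lines. This is only a sketch: the arithmetic operator in the transactions did not survive on the slide, so the code assumes addition (T1 sets x to x + y, T2 sets y to x + y). The function names are illustrative.

```python
def t1(db):
    # r(x), r(y), x = x + y, w(x)   ('+' is an assumption)
    x, y = db["x"], db["y"]
    db["x"] = x + y

def t2(db):
    # r(x), r(y), y = x + y, w(y)   ('+' is an assumption)
    x, y = db["x"], db["y"]
    db["y"] = x + y

def run_serial(order, initial):
    """Run whole transactions one after another on a copy of the data."""
    db = dict(initial)
    for txn in order:
        txn(db)
    return db

initial = {"x": 20, "y": 30}
print(run_serial([t1, t2], initial))  # T1T2: {'x': 50, 'y': 80}
print(run_serial([t2, t1], initial))  # T2T1: {'x': 70, 'y': 50}
```

Under that assumption the two serial schedules end in different states, and both are still considered correct.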
9Serializability
- Implication for concurrent execution?
- Want concurrent schedule equivalent to some serial schedule
- Serializable: A schedule S is serializable if it is equivalent to some serial schedule.
- Intuition behind serializability: since any serial execution is OK
- allow interleaved execution as long as the result will be the same as some serial execution.
- E.g. Figure 21.5: D OK (equivalent to A), C not OK
10Serializability Result Equivalency
- We said schedule S is serializable if it is equivalent to some serial schedule.
- What does equivalent mean?
- Check if concurrent schedule produces the same result as a serial schedule. How?
- First approach: pick some data values, try.
- Result equivalent: Two schedules are result equivalent if they produce the same final state on some data
- Is this idea OK?
- Saw it with the Figure 21.5 example
11Serializability Result Equivalency
- Problem: could have happened by accident, i.e. on the data we happened to look at, we get the same result, but it is not generally true
- E.g. look at Figure 21.5 again
- Any values of X, M, N which will make C produce the same result as A (or B)?
- When N = 0
- But C should not be allowed
- Want stronger guarantee. How ?
- Important ops should be in same order as serial
12Conflicting Operations
- Order of some pairs of ops is important to consider for concurrency/recovery, others not.
- Two operations are in conflict. When?
- 1. They belong to different transactions. Why?
- Within T1 we can't switch, e.g. w1(y), r1(x)
- 2. They access the same data item. Why?
- If diff. data, then it doesn't matter
- w1(x), w2(y) same as w2(y), w1(x)
- 3. At least one of them is a write op. Why?
- r1(x), r2(x) same as r2(x), r1(x): data unchanged
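The three conditions translate directly into a predicate. A minimal sketch in Python follows; the `Op` type and the name `in_conflict` are illustrative.

```python
from typing import NamedTuple

class Op(NamedTuple):
    kind: str   # 'r' for read, 'w' for write
    txn: int    # transaction number
    item: str   # data item accessed

def in_conflict(a, b):
    """Two operations conflict when all three conditions hold."""
    return (
        a.txn != b.txn                  # 1. different transactions
        and a.item == b.item            # 2. same data item
        and "w" in (a.kind, b.kind)     # 3. at least one is a write
    )

print(in_conflict(Op("w", 1, "x"), Op("w", 2, "y")))  # False: different items
print(in_conflict(Op("r", 1, "x"), Op("r", 2, "x")))  # False: both are reads
print(in_conflict(Op("r", 1, "x"), Op("w", 2, "x")))  # True
```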
13Complete Schedules
- Complete schedule S of T1, T2, ..., Tn
- Exactly the same ops in S as in T1, T2, ..., Tn
- Includes abort/commit for each Ti
- If op1 is before op2 in Ti, then same order in S
- For any pair of conflicting operations, one must occur before the other in S
- We can leave out internal operations
14Serializability Conflict Equivalent
- E.g. S: r1(x), r2(y), w1(y), w1(x), w2(x)
- What are the conflict pairs?
- (r1(x), w2(x))
- (w1(x), w2(x))
- (r2(y), w1(y))
- Conflict equivalent: Two schedules are conflict equivalent if the order of any two conflicting operations is the same in both schedules
- i.e. they have the same conflict pairs, each pair in the same order
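The conflict pairs of the example schedule S can be enumerated by comparing every operation with each later one. A minimal sketch, assuming schedules are given as strings in the r1(x) notation; `parse`, `show`, and `conflict_pairs` are illustrative names.

```python
import re

OP = re.compile(r"([rw])(\d+)\(\s*(\w+)\s*\)")

def parse(schedule):
    """Turn 'r1(x), w2(y)' into [('r', 1, 'x'), ('w', 2, 'y')]."""
    return [(kind, int(txn), item) for kind, txn, item in OP.findall(schedule)]

def show(op):
    kind, txn, item = op
    return f"{kind}{txn}({item})"

def conflict_pairs(schedule):
    """All ordered pairs (earlier op, later op) that are in conflict."""
    ops = parse(schedule)
    pairs = []
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                pairs.append((show((k1, t1, x1)), show((k2, t2, x2))))
    return pairs

for pair in conflict_pairs("r1(x), r2(y), w1(y), w1(x), w2(x)"):
    print(pair)
```

The loop reports the same three pairs listed on the slide, in order of the earlier operation of each pair.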
15Serializability Conflict Equivalent
- E.g. T1: r1(x), w1(y); T2: r2(y), w2(x)
- S1: r1(x), r2(y), w2(x), w1(y)
- S2: r2(y), w2(x), r1(x), w1(y)
- Are S1, S2 conflict equivalent?
- Are the conflict pairs the same?
- What are the conflict pairs of S1?
- (r1(x), w2(x)), (r2(y), w1(y))
- What are the conflict pairs of S2?
- (w2(x), r1(x)), (r2(y), w1(y))
- Different pairs: not conflict equivalent
16Serializability Conflict Equivalent
- E.g. S3: r1(x), r2(y), w1(y), w2(x)
- S4: r2(y), r1(x), w1(y), w2(x)
- Are S3, S4 conflict equivalent?
- Are the conflict pairs the same?
- What are the conflict pairs of S3?
- (r1(x), w2(x)), (r2(y), w1(y))
- What are the conflict pairs of S4?
- (r1(x), w2(x)), (r2(y), w1(y))
- Same pairs: S3 and S4 are conflict equivalent
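The comparisons of S1 with S2 and of S3 with S4 can be automated by comparing sets of ordered conflict pairs. A minimal sketch with illustrative names (`parse`, `ordered_conflicts`, `conflict_equivalent`), assuming both schedules are given as strings in the r1(x) notation:

```python
import re

OP = re.compile(r"([rw])(\d+)\(\s*(\w+)\s*\)")

def parse(schedule):
    return [(kind, int(txn), item) for kind, txn, item in OP.findall(schedule)]

def ordered_conflicts(schedule):
    """Set of (earlier op, later op) pairs that are in conflict."""
    ops = parse(schedule)
    return {
        (a, b)
        for i, a in enumerate(ops)
        for b in ops[i + 1:]
        if a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])
    }

def conflict_equivalent(s, t):
    """Same operations, and every conflicting pair in the same order."""
    same_ops = sorted(parse(s)) == sorted(parse(t))
    return same_ops and ordered_conflicts(s) == ordered_conflicts(t)

S1 = "r1(x), r2(y), w2(x), w1(y)"
S2 = "r2(y), w2(x), r1(x), w1(y)"
S3 = "r1(x), r2(y), w1(y), w2(x)"
S4 = "r2(y), r1(x), w1(y), w2(x)"
print(conflict_equivalent(S1, S2))  # False
print(conflict_equivalent(S3, S4))  # True
```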
17Serializability Eg Figure 21.5
- Which of C, D should be allowed?
18Serializability Conflict Equivalency
- S is conflict serializable if it is conflict equivalent to some serial schedule S'
- Figure 21.5: A (T1T2) is serial, so is B (T2T1)
- Is D conflict serializable?
- Are D's conflict pairs the same as those of A or B?
- Conflict pairs of A, B, D?
- A: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
- B: (r2(x), w1(x)), (w2(x), r1(x)), (w2(x), w1(x))
- D: (r1(x), w2(x)), (w1(x), r2(x)), (w1(x), w2(x))
- Is C conflict serializable? Conflict pairs?
- C: (r1(x), w2(x)), (w1(x), w2(x)), (r2(x), w1(x))
- C not equivalent to A: r2(x) before w1(x)
- C not equivalent to B: w1(x) before w2(x)
19Serializability
- Serializable is not the same as serial.
- What is the difference?
- Serial means no interleaving: T1 T2 T3 etc.
- Serializable allows interleaving, but has to be equivalent to a serial schedule
- Serializable schedule
- Will leave the database in a consistent state.
- Interleaving is controlled and will result in the same state as if the transactions were serially executed.
- Will achieve efficiency due to concurrent execution.
20Testing For Conflict Serializability
- Testing for conflict serializability
- Algorithm 17.1
- Looks at only read_Item (X) and write_Item (X) operations, not the internal ops
- Constructs a precedence graph (serialization graph): a graph with directed edges
- An edge is created from Ti to Tj if one of the operations in Ti appears before a conflicting operation in Tj
- The schedule is conflict serializable if and only if the precedence graph has no cycles.
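The test described above can be sketched as a precedence graph plus a depth-first cycle check. This is an illustrative Python sketch, not the textbook's algorithm verbatim. The two example schedules are assumptions: they keep only the operations on x and are chosen to match the conflict pairs listed earlier for schedules C and D.

```python
import re

OP = re.compile(r"([rw])(\d+)\(\s*(\w+)\s*\)")

def parse(schedule):
    return [(kind, int(txn), item) for kind, txn, item in OP.findall(schedule)]

def precedence_graph(schedule):
    """Edge (Ti, Tj) if an op of Ti precedes a conflicting op of Tj."""
    ops = parse(schedule)
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph.get(u, ()):
            # A neighbour still on the current path closes a cycle.
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return any(dfs(u) for u in list(graph) if u not in done)

def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))

C = "r1(x), r2(x), w1(x), w2(x)"  # assumed x-only form of schedule C
D = "r1(x), w1(x), r2(x), w2(x)"  # assumed x-only form of schedule D
print(precedence_graph(C), is_conflict_serializable(C))
print(precedence_graph(D), is_conflict_serializable(D))
```

For C the graph contains both T1 to T2 and T2 to T1, so it has a cycle. For D the only edge is T1 to T2, which corresponds to the serial order T1T2.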
21Figure 21.5 draw precedence graphs
22FIGURE 21.7 precedence graph for Figure 21.5
- Constructing precedence graphs for schedules from Figure 21.5 to test for conflict serializability. Precedence graphs for (a) serial schedule A. (b) serial schedule B. (c) schedule C (not serializable). (d) schedule D (serializable, equivalent to schedule A).
- How do we interpret the cycles?
23FIGURE 21.8 (a).
- Another example of serializability testing. (a) The READ and WRITE operations of three transactions T1, T2, and T3.
- We will look at schedules in the next 2 slides
- And draw the precedence graphs
24FIGURE 21.8 (b).
- Schedule E.
- Precedence graph ? Serializable ?
25FIGURE 21.8 (c).
- Schedule F.
- Precedence graph ? Serializable ?
26Serializability
- Issue: OS controls how ops get interleaved
- Resulting schedule may or may not be serializable
- Problem?
- If not serializable, then what?
- Have to rollback. Problem?
- Expensive, not practical! How to solve?
- Guarantee serializability. How?
- Locks
- Current approach used in most DBMSs
- Two-phase locking: will study
27View Serializability
- We have seen result equivalent and conflict equivalent.
- View equivalent: another condition. RG e.g.
- Schedule S2 is serial
- Schedule S1: R1(A), W2(A), W1(A), W3(A). Is this conflict serializable?
- No: precedence graph has a cycle.
- T1 -> T2 -> T1
- Do you think S1 should be allowed?
Schedule S1 (interleaved): T1 R(A); T2 W(A); T1 W(A); T3 W(A)
Schedule S2 (serial): T1 R(A), W(A); T2 W(A); T3 W(A)
28View Serializability
- S1 is equivalent (in every situation) to serial S2, i.e. T1, T2, T3. Why?
- Because the final value of A is written by T3
- This is a blind write, so it does not matter whether T1, T2 were in serial order or interleaved
- Stronger than result equivalent, weaker than conflict equivalent
- View equivalent: we won't do formal defn.
- View serializability is good enough
- but expensive to test (NP-hard)
- so use conflict serializability since it is easier to test
29Other Notions of Serializability
- Other types of equivalence of schedules
- Under special semantic constraints
- schedules that are otherwise not conflict serializable may work correctly.
- SKS e.g. in next slide
30SKS Example
- A is checking account
- B is savings account
- T1 transferring 50 from A to B
- T5 transferring 10 from B to A
- Is this schedule conflict serializable?
- No. Also not view serializable
- Though we have not studied definition.
- Should this schedule be allowed ?
- Yes. E.g. A = 100, B = 30. In general, OK. Why?
- D = debit, C = credit. D D C C gives the same result as D C D C, because debits and credits commute
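The claim that the interleaved order gives the same result as the serial one can be checked on the slide's values. This is a sketch under stated assumptions: each debit or credit is modelled as a single atomic step, and there is no overdraft check. All names are illustrative.

```python
def debit(db, account, amount):
    db[account] -= amount

def credit(db, account, amount):
    db[account] += amount

# Each step is (operation, account, amount).
T1 = [(debit, "A", 50), (credit, "B", 50)]  # transfer 50 from A to B
T5 = [(debit, "B", 10), (credit, "A", 10)]  # transfer 10 from B to A

def run(steps, initial):
    """Apply the steps in order to a copy of the initial balances."""
    db = dict(initial)
    for op, account, amount in steps:
        op(db, account, amount)
    return db

initial = {"A": 100, "B": 30}
serial = run(T1 + T5, initial)                            # D C D C
interleaved = run([T1[0], T5[0], T1[1], T5[1]], initial)  # D D C C
print(serial)       # {'A': 60, 'B': 70}
print(interleaved)  # {'A': 60, 'B': 70}
```

Both orders end with the same balances because the steps only add or subtract fixed amounts.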
31Recoverability vs Serializability
- Both affected by concurrent execution of transactions, but the two are quite different
- Recoverability: how to recover if a transaction aborts or the system crashes
- Serializability: even if there are no system crashes and all transactions commit
- Have to make sure we get correct results
- Equivalent to serial schedule
32Serializability Tests
- DBMS has to provide a mechanism to ensure that schedules are conflict serializable
- We have seen how to test a schedule to see if it is (was) serializable.
- How can this be used?
- We could run the transactions without attempting to control concurrency. Then what?
- Test to see if the schedule which resulted was serializable. If serializable, then what?
- Everything OK. If not serializable, then what?
- Rollback. Problem?
- Expensive. Alternative?
33Concurrency Control vs. Serializability Tests
- Develop concurrency control protocols that only allow concurrent schedules which we want
- Serializable
- Recoverable, cascadeless.
- Connection between concurrency control protocols and serializability tests?
- Tests for serializability help us understand why a concurrency control protocol is correct
- i.e. why the protocol guarantees serializability.