Title: 16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates
116. Concurrency Control and Recovery(only for
DBs with updates..!)- Review
- Concurrency Control
- Transaction
- ACID
- Isolation
- Schedules
- Guaranteeing isolation
- Serializability
- Serializability ? Isolation
- Locking
- Strict Two Phase Locking
- Strict 2PL ?Serializable
2Learning Objectives
- Define ACID, schedule, isolated, equivalent,
serializable, S2PL, conflict serializable,
precedence graph, recoverable. - Know the implications on slide 25 and when the
converses hold - Explain lock management, multiple granularity
locks, phantoms, locking in BTrees, optimistic
concurrency control
3Example Transaction
- Transfer 100 from A to B
- Read A Verify A Write A-100 then
- Read B Verify B Write B100
- Are all 6 steps necessary?
- Which steps require disc access?
- When can an abort occur without damage?
- Write is as in a programs write
- What damage can an abort cause?
- How can you avoid such damage?
4Transaction (cont.)
- User (application developer) must indicate
- Begin transaction
- read/write/modify statements intermixed with
other programming language statements - plus either
- commit - indicates successful completion or
- abort - indicates program wants to roll back
(erase the transaction) - All or nothing! (Atomic)
5Supporting the ACID Properties of Transactions
- Atomicity All actions in a transaction happen
in their entirety or not at all. - Consistency If the DB starts in a consistent
state, (this notion is defined by the user
some of it may be enforced by integrity
constraints) and if a transaction executes with
no other queries active, then the DB ends up
in a consistent state. - Isolation Each transaction is isolated from
other transactions. The effect on the DB is
as if each transaction executed by itself. - Durability If a transaction commits, its changes
to the database state persist.
Recovery System
Programmers
Concurrency Control System
Recovery System
6Isolation/Concurrency Control
T1 BEGIN A100, B-100 END T2 BEGIN
A1.06A, B1.06B END
- What is each of these transactions doing?
- A schedule of T1 and T2 is an interleaving of the
steps of these transactions so that each
transactions order is preserved.
7Which of these is a Schedule of T1 and T2?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100, B-100 T2
B1.06B, A1.06A
T1 A100,
B-100 T2 A1.06A,B1.06B
8Isolated Schedules
- A schedule is isolated if its effect on the DB is
as if each transaction executed by itself,
serially. - Which of the schedules on the next page is
isolated? - Hint Calculate the effect of the schedule on a
sample state of the DB, for example A has 1,000,
B has 500. This wont tell you the effect on
all states, but its helpful information.
9Which Schedules are Isolated?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
T1 A100,B-100 T2
A1.06A, B1.06B
- Goal of Concurrency Control subsystem Guarantee
only isolated schedules.
10Equivalent Schedules
- Two schedules are equivalent if given any
starting DB state, they produce the same result. - Which of these schedules is equivalent?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
11Serializable Schedules
- A schedule is serializable if it is equivalent
to a serial schedule. - Which of these schedules is serializable?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
12The goal of Concurrency Control
- Recall the goal of concurrency control To ensure
that all schedules are isolated - Theorem A schedule is Serializable ? it is
Isolated - ? Serializable ? equivalent to some serial
schedule, and in a serial schedule, each Xact. is
isolated - ? If each xact runs alone, the schedule must be
serial - But serializability is hard to verify
- How can we, in real time, check each schedule?
- So the Concurrency Control Subsystem needs more
work.
13Locking
- Transaction must get a lock before it can read
or update data - There are two kinds of locks shared (S) locks
and exclusive (X) locks - To read a record you MUST get an S lockTo modify
or delete a record you MUST get an X lock - Lock info maintained by a lock manager
14How Locks Work
- If a Xact has an S lock on a data object , new
transactions can get S locks on that object, but
not X locks. - If a Xact has an X lock, no other Xact can get
any lock (S or X) on that data object. - If a transaction cant get a lock, it waits (in
a queue).
lock on data item
lock you want
Lock compatibility
15Strict Two Phase Locking Protocol (S2PL)
- Strict 2PL is a way of managing locks during a
transaction - A Xact gets (S and X) locks gradually, as needed
- The Xact holds all locks until end of transaction
(commit/abort)
All locks are released at the end, upon commit
or abort
5
of locks held by a transaction T
4
3
2
1
0
time ?
16Strict 2PL guarantees serializability
- Idea of the Proof a Strict 2PL schedule is
equivalent to the serial schedule in which each
transaction runs instantaneously at the time that
it commits - This is huge A property of each transaction
(S2PL) implies a property of any set of
transactions (serializability) - No need to check serializability of any schedules
- Real DBMSs use S2PL to enforce serializability
- In reality, users can and do choose lower levels
of concurrency for all but the most sensitive
transactions
1717. Concurrency Control
- Conflicts
- Conflicting Actions
- Conflict Equivalent
- Conflict Serializable
- Conf. Ser. ? Serializable
- Precedence Graph
- Conf. Serializable ?Precedence graph is acyclic
- Strict 2PL ?Recoverable
- 2PL ? Recoverable
- Locks
- Management
- Deadlocks
- Waits-for
- Multiple Granularity
- Phantoms
- Predicate, Index locking
- Locking in B Trees
- Optimistic CC
- Inefficiency of locking
- Optimistic CC idea
1817.1 Conflict Serializable Schedules
17. CC
- Conflicting actions Actions that access the same
data and at least one of which is a write - Note that changing the order of these two actions
might yield different results. - Two schedules are conflict equivalent if
- They involve the same actions of the same
transactions in the same order - Every pair of conflicting actions is ordered the
same way - Schedule S is conflict serializable if S is
conflict equivalent to some serial schedule
19Which are conflict serializable?
17. CC
T1 R(A),W(A), R(B),W(B)
T2 R(A),W(A), R(B),W(B)
T1 R(A), W(A) T2 R(A),
W(A), R(B)
T1 R(B), W(A), W(B) T2
R(A), W(A), R(B)
T1 R(A), W(A) T2 W(A) T3
W(A)
20Conflict Serializable ? Serializable
- If two actions do not conflict, then commuting
them results in an equivalent schedule. - Suppose S is conflict serializable. Then there
is a sequence of commuting actions I I1,,In
so that - Each of the Ii commutes nonconflicting actions
- I applied to S is a serial schedule
- Because of (a), I does not change the state of
any database. Thus S, and I applied to S, are
equivalent and I applied to S is serial (b), so S
is serializable.
21Serializable does NOT imply Conflict Serializable
T1 R(A), W(A) T2 W(A) T3
W(A)
- Equivalent to what serial schedule?
- Therefore it is a serializable schedule
- Why is it not conflict serializable? (for now
just give an intuitive reason, later we will have
a proof)
22Precedence graphs
17. CC
- Why is this graph not conflict serializable?
- The cycle in the graph illustrates the problem.
T1 must precede T2, and T2 must precede T1, in
any conflict equivalent serial schedule.
T1 R(A), W(A), R(B), W(B) T2
R(A), W(A), R(B), W(B)
23Precedence Graph
17. CC
- Precedence graph One node per Xact edge from
Ti to Tj if an action of Ti precedes and
conflicts with an action of Tj. - Theorem Schedule is conflict serializable if and
only if its precedence graph is acyclic - ?If there is a cycle in the graph, it cannot be
serializable (see previous page generalize) - ? If the graph is acyclic, the schedule is
equivalent to a topologically sorted order of the
actions.
24Example of acyclic graph
- Is this graph acyclic?
- What is a topological sort of it?
- Is a schedule, for which this is a precedence
graph, equivalent to a serial schedule? - Can we move all actions of T4 to occur before T2,
without reversing conflicting actions? - How about T1 before T4?
25Summary
Isolated Xact same results as if it ran alone
Each Xact in a schedule is Isolated
?
??
Serializable Schedule Same result as a serial
schedule
?
The schedule is Serializable
?
The schedule is Conflict Serializable
Conflict Serializable Conflict Equivalent to a
Serializable Schedule
?
??
The Schedules Precedence Graph is Acyclic
?
The schedule is consistent with Strict 2PL.
Strict 2PL There is a locking schedule where all
locks are held until EOT
?
?
?
Deadlock There is a cycle in the Waitsfor graph.
Deadlock is possible
26 Strict 2PL?Recoverable
- A schedule is recoverable if, during it, all
transactions commit only after all transactions
whose data they have read commit. - Why is recoverability desirable? Otherwise, T1
may read the data of T2 ( a so-called dirty
read), then T1 commit, then T2 abort and roll
back. Then T1 has read a value that does not
exist.
27Two-Phase Locking (2PL)
17. CC
- Two-Phase Locking Protocol
- Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on
object before writing. - A transaction can not request additional locks
once it releases any locks. - 2PL implies that all schedules have acyclic
precedence graphs, so are serializable. - However, they are not recoverable, so Strict 2PL
is used in practice.
28Lock Management
17. CC
- Lock and unlock requests are handled by the lock
manager - Lock table entry
- IDs of transactions currently holding a lock
- Type of lock held (shared or exclusive)
- Pointer to queue of lock requests
- If there is an S lock on an object O and T1
requests an X lock, what happens? What if then
T2 requests an S lock? - Locking and unlocking have to be atomic
operations - How is this enforced?
- Lock upgrade transaction that holds a shared
lock can be upgraded to hold an exclusive lock if
no one else has a shared lock.
29Managing a new lock (simplified)
New Lock
Type of lock?
S
X
Y
Queue empty?
? lock?
EnQ
N
N
Y
Grant X lock
EnQ
? lock?
N
Y
Grant S lock
Type of lock?
S
X
EnQ
30Managing a lock release (simplified)
Release Lock
Y
Exit
? Other locks?
N
DeQ xact from Q, give it a lock
Y
Is there an S lock on top of the Q?
S
What type of lock was it?
N
X
Exit
31Deadlocks
17. CC
- Deadlock Cycle of transactions waiting for locks
to be released by each other. - Two ways of dealing with deadlocks
- Deadlock prevention
- Deadlock detection
32Deadlock Prevention
17. CC
- Theory
- Assign priorities based on timestamps.
- Older transactions get higher priority.
- Assume Ti wants a lock that Tj holds. Two
policies are possible - Wait-Die It Ti has higher priority, Ti waits for
Tj otherwise Ti aborts - Wound-wait If Ti has higher priority, Tj aborts
otherwise Ti waits - If a transaction re-starts, make sure it has its
original timestamp - Practice http//dev.mysql.com/doc/refman/5.0/en/i
nnodb-deadlocks.html
33Deadlock Detection
17. CC
- Create a waits-for graph
- Nodes are transactions
- There is an edge from Ti to Tj if Ti is waiting
for Tj to release a lock - Periodically check for cycles in the waits-for
graph - Note that waits-for graph is opposite direction
of precedence graph.
34Deadlock Detection (Continued)
17. CC
- Example
- T1 S(A), R(A), S(B)
- T2 X(B),W(B) X(C)
- T3 S(C), R(C) X(A)
- T4 X(B)
T1
T2
T4
T3
35Multiple-Granularity Locks
17. CC
- Hard to decide what granularity to lock (tuples
vs. pages vs. tables). - Shouldnt have to decide!
- Data containers are nested
contains
36Solution New Lock Modes, Protocol
17. CC
- Allow Xacts to lock at each level, but with a
special protocol using new intention locks
- Before locking an item, Xact must set intention
locks on all its ancestors. - IX(IS) Intend to X(S) lock a subset.
- SIX S IX at the same time. Used to scan and
update selected records.
37Multiple Granularity Lock Protocol
17. CC
- Each Xact starts from the root of the hierarchy.
- To get S or IS lock on a node, must hold IS or IX
on parent node. - To get X or IX or SIX on a node, must hold IX or
SIX on parent node. - Must release locks in bottom-up order.
- Sometimes hard to decide granularity of locks.
Can start small and use lock escalation.
38Examples
17. CC
- T1 scans R, and updates a few tuples
- T1 gets an SIX lock on R, then repeatedly gets an
S lock on tuples of R, and occasionally upgrades
to X on the tuples. - T2 uses an index to read only part of R
- T2 gets an IS lock on R, and repeatedly
gets an S lock on tuples of R. - T3 reads all of R
- T3 gets an S lock on R.
- OR, T3 could behave like T2 can
use lock escalation to decide
which.
39Dynamic Databases Phantoms
17. CC
Name Rank Age
Pehr 1 25
John 2 26
Lorr 2 23
- If we allow updates, even Strict 2PL will not
assure serializability - T1 finds oldest sailor in each rank
- T2 inserts(Rohi,1,27) and deletes John
- Schedule is
-
- This schedule is Strict 2PL, but not
serializable! - Result of this schedule is (1,Pehr)(2,Lorr)
- Result of T1T2 is (1,Pehr)(2,John)
- Result of T2T1 is (1,Rohi)(2,Lorr)
T1 rank 1
T1 rank 2
T2 inserts Rohi, deletes John
40The Problem
17. CC
- When T1 retrieved the oldest sailor of rank 1, it
locked each sailor of rank 1 with a read lock. - None of these locks applied to the new record (a
phantom) inserted by T2. - We need a mechanism to prevent phantoms to allow
T1 to lock present and future sailors with rank
1. - There are two such mechanisms, index locking and
predicate locking.
41Index Locking
Data
Index
r1
- If there is a dense index on the rating field
using Alternative (2), T1 should lock the index
page(s) containing the data entries with rating
1. - If there are no records with rating 1, T1 must
lock the index page where such a data entry would
be, if it existed! - If there is no suitable index, T1 must lock all
pages, and lock the file/table to prevent new
pages from being added, to ensure that no new
records with rating 1 are added.
42Predicate Locking
17. CC
- Grant lock on all records that satisfy some
logical predicate, e.g. age gt 2salary. - Index locking is a special case of predicate
locking for which an index supports efficient
implementation of the predicate lock. - What is the predicate in the sailor example?
- In general, predicate locking has a lot of
locking overhead.
43Locking in B Trees
17. CC
- How can we efficiently lock a B tree?
- Btw, dont confuse this with multiple granularity
locking! - One solution Ignore the tree structure, just
lock pages while traversing the tree, following
2PL. - This has terrible performance!
- Root node (and many higher level nodes) become
bottlenecks because every tree access begins at
the root. This single threads all updates to the
tree.
44Two Useful Observations
17. CC
- Higher levels of the tree only direct searches
for leaf pages. - For inserts/deletes, a node on a path from root
to modified leaf must be locked (in X mode, of
course), only if a split can propagate up to it
from the modified leaf. - We can exploit these observations to design
efficient locking protocols that guarantee
serializability even though they violate 2PL.
45A Tree Locking Algorithm
17. CC
- Search Start at root and go down repeatedly, S
lock child then unlock parent. - Insert/Delete Start at root and go down,
obtaining X locks as needed. Once child is
locked, check if it is safe - If child is safe, release all locks on ancestors.
- Safe node Node such that the change will not
propagate up beyond this node. - Inserts Node is not full.
- Deletes Node is not half-empty.
46Example
ROOT
Do 1) Search 38 2) Delete 38 3) Insert
45 4) Insert 25
A
20
35
B
35
23
38 44
35
35
C
F
38
44
23
H
D
E
G
I
20
22
23
24
35
36
38
41
44
47Optimistic CC (Kung-Robinson)
17. CC
- Locking is a conservative approach in which
conflicts are prevented. Disadvantages - Lock management overhead.
- Deadlock detection/resolution.
- Lock contention for heavily used objects.
- If conflicts are rare, we might be able to gain
concurrency by not locking, and instead checking
for conflicts before Xacts commit. - A version of this optimistic approach is used by
PostgreSQL and Oracle
48Kung-Robinson Model
17. CC
- Xacts have three phases
- READ Xacts read from the database, but make
changes to private copies of objects. - VALIDATE Check for conflicts.
- WRITE Make local copies of changes public.
old
ROOT
modified objects
new
49Validation
17. CC
- Each Xact is assigned a numeric id.
- Just use a timestamp.
- Xact ids assigned at end of READ phase, just
before validation begins. - ReadSet(Ti) Set of objects read by Xact Ti.
- WriteSet(Ti) Set of objects modified by Ti.
50Test 1
17. CC
- For all i and j such that TSi lt TSj, check that
Ti completes before Tj begins.
Ti
Tj
R
V
W
R
V
W
51Test 2
17. CC
- For all i and j such that Ti lt Tj, check that
- Ti completes before Tj begins its Write phase
- WriteSet(Ti) ReadSet(Tj) is empty.
Ti
R
V
W
Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
52Test 3
17. CC
- For all i and j such that Ti lt Tj, check that
- Ti completes Read phase before Tj does
- WriteSet(Ti) ReadSet(Tj) is empty
- WriteSet(Ti) WriteSet(Tj) is empty.
Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
53Overheads in Optimistic CC
17. CC
- Must record read/write activity in ReadSet and
WriteSet per Xact. - Must create and destroy these sets as needed.
- Must check for conflicts during validation, and
must make validated writes global. - Critical section can reduce concurrency.
- Scheme for making writes global can reduce
clustering of objects. - Optimistic CC restarts Xacts that fail
validation. - Work done so far is wasted requires clean-up.