16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates - PowerPoint PPT Presentation

About This Presentation

Title:

16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates

Description:

Is a schedule, for which this is a precedence graph, equivalent to a serial schedule? ... that all schedules have acyclic precedence graphs, so are serializable. ... – PowerPoint PPT presentation

Number of Views:118

Avg rating:3.0/5.0

Slides: 54

Provided by: RaghuRamak

Learn more at: http://web.cecs.pdx.edu

Category:

more less

Transcript and Presenter's Notes

Title: 16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates

1
16. Concurrency Control and Recovery(only for
DBs with updates..!)- Review

Concurrency Control
Transaction
ACID
Isolation
Schedules
Guaranteeing isolation
Serializability
Serializability ? Isolation
Locking
Strict Two Phase Locking
Strict 2PL ?Serializable

2
Learning Objectives

Define ACID, schedule, isolated, equivalent,
serializable, S2PL, conflict serializable,
precedence graph, recoverable.
Know the implications on slide 25 and when the
converses hold
Explain lock management, multiple granularity
locks, phantoms, locking in BTrees, optimistic
concurrency control

3
Example Transaction

Transfer 100 from A to B
Read A Verify A Write A-100 then
Read B Verify B Write B100
Are all 6 steps necessary?
Which steps require disc access?
When can an abort occur without damage?
Write is as in a programs write
What damage can an abort cause?
How can you avoid such damage?

4
Transaction (cont.)

User (application developer) must indicate
Begin transaction
read/write/modify statements intermixed with
other programming language statements
plus either
commit - indicates successful completion or
abort - indicates program wants to roll back
(erase the transaction)
All or nothing! (Atomic)

5
Supporting the ACID Properties of Transactions

Atomicity All actions in a transaction happen
in their entirety or not at all.
Consistency If the DB starts in a consistent
state, (this notion is defined by the user
some of it may be enforced by integrity
constraints) and if a transaction executes with
no other queries active, then the DB ends up
in a consistent state.
Isolation Each transaction is isolated from
other transactions. The effect on the DB is
as if each transaction executed by itself.
Durability If a transaction commits, its changes
to the database state persist.

Recovery System
Programmers
Concurrency Control System
Recovery System
6
Isolation/Concurrency Control
T1 BEGIN A100, B-100 END T2 BEGIN
A1.06A, B1.06B END

What is each of these transactions doing?
A schedule of T1 and T2 is an interleaving of the
steps of these transactions so that each
transactions order is preserved.

7
Which of these is a Schedule of T1 and T2?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100, B-100 T2
B1.06B, A1.06A
T1 A100,
B-100 T2 A1.06A,B1.06B
8
Isolated Schedules

A schedule is isolated if its effect on the DB is
as if each transaction executed by itself,
serially.
Which of the schedules on the next page is
isolated?
Hint Calculate the effect of the schedule on a
sample state of the DB, for example A has 1,000,
B has 500. This wont tell you the effect on
all states, but its helpful information.

9
Which Schedules are Isolated?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
T1 A100,B-100 T2
A1.06A, B1.06B

Goal of Concurrency Control subsystem Guarantee
only isolated schedules.

10
Equivalent Schedules

Two schedules are equivalent if given any
starting DB state, they produce the same result.
Which of these schedules is equivalent?

T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
11
Serializable Schedules

A schedule is serializable if it is equivalent
to a serial schedule.
Which of these schedules is serializable?

T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
12
The goal of Concurrency Control

Recall the goal of concurrency control To ensure
that all schedules are isolated
Theorem A schedule is Serializable ? it is
Isolated
? Serializable ? equivalent to some serial
schedule, and in a serial schedule, each Xact. is
isolated
? If each xact runs alone, the schedule must be
serial
But serializability is hard to verify
How can we, in real time, check each schedule?
So the Concurrency Control Subsystem needs more
work.

13
Locking

Transaction must get a lock before it can read
or update data
There are two kinds of locks shared (S) locks
and exclusive (X) locks
To read a record you MUST get an S lockTo modify
or delete a record you MUST get an X lock
Lock info maintained by a lock manager

14
How Locks Work

If a Xact has an S lock on a data object , new
transactions can get S locks on that object, but
not X locks.
If a Xact has an X lock, no other Xact can get
any lock (S or X) on that data object.
If a transaction cant get a lock, it waits (in
a queue).

lock on data item
lock you want
Lock compatibility
15
Strict Two Phase Locking Protocol (S2PL)

Strict 2PL is a way of managing locks during a
transaction
A Xact gets (S and X) locks gradually, as needed
The Xact holds all locks until end of transaction
(commit/abort)

All locks are released at the end, upon commit
or abort
5
of locks held by a transaction T
4
3
2
1
0
time ?
16
Strict 2PL guarantees serializability

Idea of the Proof a Strict 2PL schedule is
equivalent to the serial schedule in which each
transaction runs instantaneously at the time that
it commits
This is huge A property of each transaction
(S2PL) implies a property of any set of
transactions (serializability)
No need to check serializability of any schedules
Real DBMSs use S2PL to enforce serializability
In reality, users can and do choose lower levels
of concurrency for all but the most sensitive
transactions

17
17. Concurrency Control

Conflicts
Conflicting Actions
Conflict Equivalent
Conflict Serializable
Conf. Ser. ? Serializable
Precedence Graph
Conf. Serializable ?Precedence graph is acyclic
Strict 2PL ?Recoverable
2PL ? Recoverable

Locks
Management
Deadlocks
Waits-for
Multiple Granularity
Phantoms
Predicate, Index locking
Locking in B Trees
Optimistic CC
Inefficiency of locking
Optimistic CC idea

18
17.1 Conflict Serializable Schedules
17. CC

Conflicting actions Actions that access the same
data and at least one of which is a write
Note that changing the order of these two actions
might yield different results.
Two schedules are conflict equivalent if
They involve the same actions of the same
transactions in the same order
Every pair of conflicting actions is ordered the
same way
Schedule S is conflict serializable if S is
conflict equivalent to some serial schedule

19
Which are conflict serializable?
17. CC
T1 R(A),W(A), R(B),W(B)
T2 R(A),W(A), R(B),W(B)
T1 R(A), W(A) T2 R(A),
W(A), R(B)
T1 R(B), W(A), W(B) T2
R(A), W(A), R(B)
T1 R(A), W(A) T2 W(A) T3
W(A)
20
Conflict Serializable ? Serializable

If two actions do not conflict, then commuting
them results in an equivalent schedule.
Suppose S is conflict serializable. Then there
is a sequence of commuting actions I I1,,In
so that
Each of the Ii commutes nonconflicting actions
I applied to S is a serial schedule
Because of (a), I does not change the state of
any database. Thus S, and I applied to S, are
equivalent and I applied to S is serial (b), so S
is serializable.

21
Serializable does NOT imply Conflict Serializable
T1 R(A), W(A) T2 W(A) T3
W(A)

Equivalent to what serial schedule?
Therefore it is a serializable schedule
Why is it not conflict serializable? (for now
just give an intuitive reason, later we will have
a proof)

22
Precedence graphs
17. CC

Why is this graph not conflict serializable?
The cycle in the graph illustrates the problem.
T1 must precede T2, and T2 must precede T1, in
any conflict equivalent serial schedule.

T1 R(A), W(A), R(B), W(B) T2
R(A), W(A), R(B), W(B)
23
Precedence Graph
17. CC

Precedence graph One node per Xact edge from
Ti to Tj if an action of Ti precedes and
conflicts with an action of Tj.
Theorem Schedule is conflict serializable if and
only if its precedence graph is acyclic
?If there is a cycle in the graph, it cannot be
serializable (see previous page generalize)
? If the graph is acyclic, the schedule is
equivalent to a topologically sorted order of the
actions.

24
Example of acyclic graph

Is this graph acyclic?
What is a topological sort of it?
Is a schedule, for which this is a precedence
graph, equivalent to a serial schedule?
Can we move all actions of T4 to occur before T2,
without reversing conflicting actions?
How about T1 before T4?

25
Summary
Isolated Xact same results as if it ran alone
Each Xact in a schedule is Isolated
?
??
Serializable Schedule Same result as a serial
schedule
?
The schedule is Serializable
?
The schedule is Conflict Serializable
Conflict Serializable Conflict Equivalent to a
Serializable Schedule
?
??
The Schedules Precedence Graph is Acyclic
?
The schedule is consistent with Strict 2PL.
Strict 2PL There is a locking schedule where all
locks are held until EOT
?
?
?
Deadlock There is a cycle in the Waitsfor graph.
Deadlock is possible
26
Strict 2PL?Recoverable

A schedule is recoverable if, during it, all
transactions commit only after all transactions
whose data they have read commit.
Why is recoverability desirable? Otherwise, T1
may read the data of T2 ( a so-called dirty
read), then T1 commit, then T2 abort and roll
back. Then T1 has read a value that does not
exist.

27
Two-Phase Locking (2PL)
17. CC

Two-Phase Locking Protocol
Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on
object before writing.
A transaction can not request additional locks
once it releases any locks.
2PL implies that all schedules have acyclic
precedence graphs, so are serializable.
However, they are not recoverable, so Strict 2PL
is used in practice.

28
Lock Management
17. CC

Lock and unlock requests are handled by the lock
manager
Lock table entry
IDs of transactions currently holding a lock
Type of lock held (shared or exclusive)
Pointer to queue of lock requests
If there is an S lock on an object O and T1
requests an X lock, what happens? What if then
T2 requests an S lock?
Locking and unlocking have to be atomic
operations
How is this enforced?
Lock upgrade transaction that holds a shared
lock can be upgraded to hold an exclusive lock if
no one else has a shared lock.

29
Managing a new lock (simplified)
New Lock
Type of lock?
S
X
Y
Queue empty?
? lock?
EnQ
N
N
Y
Grant X lock
EnQ
? lock?
N
Y
Grant S lock
Type of lock?
S
X
EnQ
30
Managing a lock release (simplified)
Release Lock
Y
Exit
? Other locks?
N
DeQ xact from Q, give it a lock
Y
Is there an S lock on top of the Q?
S
What type of lock was it?
N
X
Exit
31
Deadlocks
17. CC

Deadlock Cycle of transactions waiting for locks
to be released by each other.
Two ways of dealing with deadlocks
Deadlock prevention
Deadlock detection

32
Deadlock Prevention
17. CC

Theory
Assign priorities based on timestamps.
Older transactions get higher priority.
Assume Ti wants a lock that Tj holds. Two
policies are possible
Wait-Die It Ti has higher priority, Ti waits for
Tj otherwise Ti aborts
Wound-wait If Ti has higher priority, Tj aborts
otherwise Ti waits
If a transaction re-starts, make sure it has its
original timestamp
Practice http//dev.mysql.com/doc/refman/5.0/en/i
nnodb-deadlocks.html

33
Deadlock Detection
17. CC

Create a waits-for graph
Nodes are transactions
There is an edge from Ti to Tj if Ti is waiting
for Tj to release a lock
Periodically check for cycles in the waits-for
graph
Note that waits-for graph is opposite direction
of precedence graph.

34
Deadlock Detection (Continued)
17. CC

Example
T1 S(A), R(A), S(B)
T2 X(B),W(B) X(C)
T3 S(C), R(C) X(A)
T4 X(B)

T1
T2
T4
T3
35
Multiple-Granularity Locks
17. CC

Hard to decide what granularity to lock (tuples
vs. pages vs. tables).
Shouldnt have to decide!
Data containers are nested

contains
36
Solution New Lock Modes, Protocol
17. CC

Allow Xacts to lock at each level, but with a
special protocol using new intention locks

Before locking an item, Xact must set intention
locks on all its ancestors.
IX(IS) Intend to X(S) lock a subset.
SIX S IX at the same time. Used to scan and
update selected records.

37
Multiple Granularity Lock Protocol
17. CC

Each Xact starts from the root of the hierarchy.
To get S or IS lock on a node, must hold IS or IX
on parent node.
To get X or IX or SIX on a node, must hold IX or
SIX on parent node.
Must release locks in bottom-up order.
Sometimes hard to decide granularity of locks.
Can start small and use lock escalation.

38
Examples
17. CC

T1 scans R, and updates a few tuples
T1 gets an SIX lock on R, then repeatedly gets an
S lock on tuples of R, and occasionally upgrades
to X on the tuples.
T2 uses an index to read only part of R
T2 gets an IS lock on R, and repeatedly
gets an S lock on tuples of R.
T3 reads all of R
T3 gets an S lock on R.
OR, T3 could behave like T2 can
use lock escalation to decide
which.

39
Dynamic Databases Phantoms
17. CC
Name Rank Age
Pehr 1 25
John 2 26
Lorr 2 23

If we allow updates, even Strict 2PL will not
assure serializability
T1 finds oldest sailor in each rank
T2 inserts(Rohi,1,27) and deletes John
Schedule is
This schedule is Strict 2PL, but not
serializable!
Result of this schedule is (1,Pehr)(2,Lorr)
Result of T1T2 is (1,Pehr)(2,John)
Result of T2T1 is (1,Rohi)(2,Lorr)

T1 rank 1
T1 rank 2
T2 inserts Rohi, deletes John
40
The Problem
17. CC

When T1 retrieved the oldest sailor of rank 1, it
locked each sailor of rank 1 with a read lock.
None of these locks applied to the new record (a
phantom) inserted by T2.
We need a mechanism to prevent phantoms to allow
T1 to lock present and future sailors with rank
1.
There are two such mechanisms, index locking and
predicate locking.

41
Index Locking
Data
Index
r1

If there is a dense index on the rating field
using Alternative (2), T1 should lock the index
page(s) containing the data entries with rating
1.
If there are no records with rating 1, T1 must
lock the index page where such a data entry would
be, if it existed!
If there is no suitable index, T1 must lock all
pages, and lock the file/table to prevent new
pages from being added, to ensure that no new
records with rating 1 are added.

42
Predicate Locking
17. CC

Grant lock on all records that satisfy some
logical predicate, e.g. age gt 2salary.
Index locking is a special case of predicate
locking for which an index supports efficient
implementation of the predicate lock.
What is the predicate in the sailor example?
In general, predicate locking has a lot of
locking overhead.

43
Locking in B Trees
17. CC

How can we efficiently lock a B tree?
Btw, dont confuse this with multiple granularity
locking!
One solution Ignore the tree structure, just
lock pages while traversing the tree, following
2PL.
This has terrible performance!
Root node (and many higher level nodes) become
bottlenecks because every tree access begins at
the root. This single threads all updates to the
tree.

44
Two Useful Observations
17. CC

Higher levels of the tree only direct searches
for leaf pages.
For inserts/deletes, a node on a path from root
to modified leaf must be locked (in X mode, of
course), only if a split can propagate up to it
from the modified leaf.
We can exploit these observations to design
efficient locking protocols that guarantee
serializability even though they violate 2PL.

45
A Tree Locking Algorithm
17. CC

Search Start at root and go down repeatedly, S
lock child then unlock parent.
Insert/Delete Start at root and go down,
obtaining X locks as needed. Once child is
locked, check if it is safe
If child is safe, release all locks on ancestors.
Safe node Node such that the change will not
propagate up beyond this node.
Inserts Node is not full.
Deletes Node is not half-empty.

46
Example
ROOT
Do 1) Search 38 2) Delete 38 3) Insert
45 4) Insert 25
A
20
35
B
35
23
38 44
35
35
C
F
38
44
23
H
D
E
G
I
20
22
23
24
35
36
38
41
44
47
Optimistic CC (Kung-Robinson)
17. CC

Locking is a conservative approach in which
conflicts are prevented. Disadvantages
Lock management overhead.
Deadlock detection/resolution.
Lock contention for heavily used objects.
If conflicts are rare, we might be able to gain
concurrency by not locking, and instead checking
for conflicts before Xacts commit.
A version of this optimistic approach is used by
PostgreSQL and Oracle

48
Kung-Robinson Model
17. CC

Xacts have three phases
READ Xacts read from the database, but make
changes to private copies of objects.
VALIDATE Check for conflicts.
WRITE Make local copies of changes public.

old
ROOT
modified objects
new
49
Validation
17. CC

Each Xact is assigned a numeric id.
Just use a timestamp.
Xact ids assigned at end of READ phase, just
before validation begins.
ReadSet(Ti) Set of objects read by Xact Ti.
WriteSet(Ti) Set of objects modified by Ti.

50
Test 1
17. CC

For all i and j such that TSi lt TSj, check that
Ti completes before Tj begins.

Ti
Tj
R
V
W
R
V
W
51
Test 2
17. CC

For all i and j such that Ti lt Tj, check that
Ti completes before Tj begins its Write phase
WriteSet(Ti) ReadSet(Tj) is empty.

Ti
R
V
W
Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
52
Test 3
17. CC

For all i and j such that Ti lt Tj, check that
Ti completes Read phase before Tj does
WriteSet(Ti) ReadSet(Tj) is empty
WriteSet(Ti) WriteSet(Tj) is empty.

Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
53
Overheads in Optimistic CC
17. CC

Must record read/write activity in ReadSet and
WriteSet per Xact.
Must create and destroy these sets as needed.
Must check for conflicts during validation, and
must make validated writes global.
Critical section can reduce concurrency.
Scheme for making writes global can reduce
clustering of objects.
Optimistic CC restarts Xacts that fail
validation.
Work done so far is wasted requires clean-up.