16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates - PowerPoint PPT Presentation

About This Presentation
Title:

16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates

Description:

Is a schedule, for which this is a precedence graph, equivalent to a serial schedule? ... that all schedules have acyclic precedence graphs, so are serializable. ... – PowerPoint PPT presentation

Number of Views:118
Avg rating:3.0/5.0
Slides: 54
Provided by: RaghuRamak
Learn more at: http://web.cecs.pdx.edu
Category:

less

Transcript and Presenter's Notes

Title: 16.%20Concurrency%20Control%20and%20Recovery%20(only%20for%20DBs%20with%20updates


1
16. Concurrency Control and Recovery(only for
DBs with updates..!)- Review
  • Concurrency Control
  • Transaction
  • ACID
  • Isolation
  • Schedules
  • Guaranteeing isolation
  • Serializability
  • Serializability ? Isolation
  • Locking
  • Strict Two Phase Locking
  • Strict 2PL ?Serializable

2
Learning Objectives
  • Define ACID, schedule, isolated, equivalent,
    serializable, S2PL, conflict serializable,
    precedence graph, recoverable.
  • Know the implications on slide 25 and when the
    converses hold
  • Explain lock management, multiple granularity
    locks, phantoms, locking in BTrees, optimistic
    concurrency control

3
Example Transaction
  • Transfer 100 from A to B
  • Read A Verify A Write A-100 then
  • Read B Verify B Write B100
  • Are all 6 steps necessary?
  • Which steps require disc access?
  • When can an abort occur without damage?
  • Write is as in a programs write
  • What damage can an abort cause?
  • How can you avoid such damage?

4
Transaction (cont.)
  • User (application developer) must indicate
  • Begin transaction
  • read/write/modify statements intermixed with
    other programming language statements
  • plus either
  • commit - indicates successful completion or
  • abort - indicates program wants to roll back
    (erase the transaction)
  • All or nothing! (Atomic)

5
Supporting the ACID Properties of Transactions
  • Atomicity All actions in a transaction happen
    in their entirety or not at all.
  • Consistency If the DB starts in a consistent
    state, (this notion is defined by the user
    some of it may be enforced by integrity
    constraints) and if a transaction executes with
    no other queries active, then the DB ends up
    in a consistent state.
  • Isolation Each transaction is isolated from
    other transactions. The effect on the DB is
    as if each transaction executed by itself.
  • Durability If a transaction commits, its changes
    to the database state persist.

Recovery System
Programmers
Concurrency Control System
Recovery System
6
Isolation/Concurrency Control
T1 BEGIN A100, B-100 END T2 BEGIN
A1.06A, B1.06B END
  • What is each of these transactions doing?
  • A schedule of T1 and T2 is an interleaving of the
    steps of these transactions so that each
    transactions order is preserved.

7
Which of these is a Schedule of T1 and T2?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100, B-100 T2
B1.06B, A1.06A
T1 A100,
B-100 T2 A1.06A,B1.06B
8
Isolated Schedules
  • A schedule is isolated if its effect on the DB is
    as if each transaction executed by itself,
    serially.
  • Which of the schedules on the next page is
    isolated?
  • Hint Calculate the effect of the schedule on a
    sample state of the DB, for example A has 1,000,
    B has 500. This wont tell you the effect on
    all states, but its helpful information.

9
Which Schedules are Isolated?
T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
T1 A100,B-100 T2
A1.06A, B1.06B
  • Goal of Concurrency Control subsystem Guarantee
    only isolated schedules.

10
Equivalent Schedules
  • Two schedules are equivalent if given any
    starting DB state, they produce the same result.
  • Which of these schedules is equivalent?

T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
11
Serializable Schedules
  • A schedule is serializable if it is equivalent
    to a serial schedule.
  • Which of these schedules is serializable?

T1 A100, B-100 T2 A1.06A,
B1.06B
T1 A100,
B-100 T2 A1.06A, B1.06B
T1 A100, B-100 T2
A1.06A, B1.06B
T1 A100,
B-100 T2 A1.06A,B1.06B
12
The goal of Concurrency Control
  • Recall the goal of concurrency control To ensure
    that all schedules are isolated
  • Theorem A schedule is Serializable ? it is
    Isolated
  • ? Serializable ? equivalent to some serial
    schedule, and in a serial schedule, each Xact. is
    isolated
  • ? If each xact runs alone, the schedule must be
    serial
  • But serializability is hard to verify
  • How can we, in real time, check each schedule?
  • So the Concurrency Control Subsystem needs more
    work.

13
Locking
  • Transaction must get a lock before it can read
    or update data
  • There are two kinds of locks shared (S) locks
    and exclusive (X) locks
  • To read a record you MUST get an S lockTo modify
    or delete a record you MUST get an X lock
  • Lock info maintained by a lock manager

14
How Locks Work
  • If a Xact has an S lock on a data object , new
    transactions can get S locks on that object, but
    not X locks.
  • If a Xact has an X lock, no other Xact can get
    any lock (S or X) on that data object.
  • If a transaction cant get a lock, it waits (in
    a queue).

lock on data item
lock you want
Lock compatibility
15
Strict Two Phase Locking Protocol (S2PL)
  • Strict 2PL is a way of managing locks during a
    transaction
  • A Xact gets (S and X) locks gradually, as needed
  • The Xact holds all locks until end of transaction
    (commit/abort)

All locks are released at the end, upon commit
or abort
5
of locks held by a transaction T
4
3
2
1
0
time ?
16
Strict 2PL guarantees serializability
  • Idea of the Proof a Strict 2PL schedule is
    equivalent to the serial schedule in which each
    transaction runs instantaneously at the time that
    it commits
  • This is huge A property of each transaction
    (S2PL) implies a property of any set of
    transactions (serializability)
  • No need to check serializability of any schedules
  • Real DBMSs use S2PL to enforce serializability
  • In reality, users can and do choose lower levels
    of concurrency for all but the most sensitive
    transactions

17
17. Concurrency Control
  • Conflicts
  • Conflicting Actions
  • Conflict Equivalent
  • Conflict Serializable
  • Conf. Ser. ? Serializable
  • Precedence Graph
  • Conf. Serializable ?Precedence graph is acyclic
  • Strict 2PL ?Recoverable
  • 2PL ? Recoverable
  • Locks
  • Management
  • Deadlocks
  • Waits-for
  • Multiple Granularity
  • Phantoms
  • Predicate, Index locking
  • Locking in B Trees
  • Optimistic CC
  • Inefficiency of locking
  • Optimistic CC idea

18
17.1 Conflict Serializable Schedules
17. CC
  • Conflicting actions Actions that access the same
    data and at least one of which is a write
  • Note that changing the order of these two actions
    might yield different results.
  • Two schedules are conflict equivalent if
  • They involve the same actions of the same
    transactions in the same order
  • Every pair of conflicting actions is ordered the
    same way
  • Schedule S is conflict serializable if S is
    conflict equivalent to some serial schedule

19
Which are conflict serializable?
17. CC
T1 R(A),W(A), R(B),W(B)
T2 R(A),W(A), R(B),W(B)
T1 R(A), W(A) T2 R(A),
W(A), R(B)
T1 R(B), W(A), W(B) T2
R(A), W(A), R(B)
T1 R(A), W(A) T2 W(A) T3
W(A)
20
Conflict Serializable ? Serializable
  • If two actions do not conflict, then commuting
    them results in an equivalent schedule.
  • Suppose S is conflict serializable. Then there
    is a sequence of commuting actions I I1,,In
    so that
  • Each of the Ii commutes nonconflicting actions
  • I applied to S is a serial schedule
  • Because of (a), I does not change the state of
    any database. Thus S, and I applied to S, are
    equivalent and I applied to S is serial (b), so S
    is serializable.

21
Serializable does NOT imply Conflict Serializable
T1 R(A), W(A) T2 W(A) T3
W(A)
  • Equivalent to what serial schedule?
  • Therefore it is a serializable schedule
  • Why is it not conflict serializable? (for now
    just give an intuitive reason, later we will have
    a proof)

22
Precedence graphs
17. CC
  • Why is this graph not conflict serializable?
  • The cycle in the graph illustrates the problem.
    T1 must precede T2, and T2 must precede T1, in
    any conflict equivalent serial schedule.

T1 R(A), W(A), R(B), W(B) T2
R(A), W(A), R(B), W(B)
23
Precedence Graph
17. CC
  • Precedence graph One node per Xact edge from
    Ti to Tj if an action of Ti precedes and
    conflicts with an action of Tj.
  • Theorem Schedule is conflict serializable if and
    only if its precedence graph is acyclic
  • ?If there is a cycle in the graph, it cannot be
    serializable (see previous page generalize)
  • ? If the graph is acyclic, the schedule is
    equivalent to a topologically sorted order of the
    actions.

24
Example of acyclic graph
  • Is this graph acyclic?
  • What is a topological sort of it?
  • Is a schedule, for which this is a precedence
    graph, equivalent to a serial schedule?
  • Can we move all actions of T4 to occur before T2,
    without reversing conflicting actions?
  • How about T1 before T4?

25
Summary
Isolated Xact same results as if it ran alone
Each Xact in a schedule is Isolated
?
??
Serializable Schedule Same result as a serial
schedule
?
The schedule is Serializable
?
The schedule is Conflict Serializable
Conflict Serializable Conflict Equivalent to a
Serializable Schedule
?
??
The Schedules Precedence Graph is Acyclic
?
The schedule is consistent with Strict 2PL.
Strict 2PL There is a locking schedule where all
locks are held until EOT
?
?
?
Deadlock There is a cycle in the Waitsfor graph.
Deadlock is possible
26
Strict 2PL?Recoverable
  • A schedule is recoverable if, during it, all
    transactions commit only after all transactions
    whose data they have read commit.
  • Why is recoverability desirable? Otherwise, T1
    may read the data of T2 ( a so-called dirty
    read), then T1 commit, then T2 abort and roll
    back. Then T1 has read a value that does not
    exist.

27
Two-Phase Locking (2PL)
17. CC
  • Two-Phase Locking Protocol
  • Each Xact must obtain a S (shared) lock on object
    before reading, and an X (exclusive) lock on
    object before writing.
  • A transaction can not request additional locks
    once it releases any locks.
  • 2PL implies that all schedules have acyclic
    precedence graphs, so are serializable.
  • However, they are not recoverable, so Strict 2PL
    is used in practice.

28
Lock Management
17. CC
  • Lock and unlock requests are handled by the lock
    manager
  • Lock table entry
  • IDs of transactions currently holding a lock
  • Type of lock held (shared or exclusive)
  • Pointer to queue of lock requests
  • If there is an S lock on an object O and T1
    requests an X lock, what happens? What if then
    T2 requests an S lock?
  • Locking and unlocking have to be atomic
    operations
  • How is this enforced?
  • Lock upgrade transaction that holds a shared
    lock can be upgraded to hold an exclusive lock if
    no one else has a shared lock.

29
Managing a new lock (simplified)
New Lock
Type of lock?
S
X
Y
Queue empty?
? lock?
EnQ
N
N
Y
Grant X lock
EnQ
? lock?
N
Y
Grant S lock
Type of lock?
S
X
EnQ
30
Managing a lock release (simplified)
Release Lock
Y
Exit
? Other locks?
N
DeQ xact from Q, give it a lock
Y
Is there an S lock on top of the Q?
S
What type of lock was it?
N
X
Exit
31
Deadlocks
17. CC
  • Deadlock Cycle of transactions waiting for locks
    to be released by each other.
  • Two ways of dealing with deadlocks
  • Deadlock prevention
  • Deadlock detection

32
Deadlock Prevention
17. CC
  • Theory
  • Assign priorities based on timestamps.
  • Older transactions get higher priority.
  • Assume Ti wants a lock that Tj holds. Two
    policies are possible
  • Wait-Die It Ti has higher priority, Ti waits for
    Tj otherwise Ti aborts
  • Wound-wait If Ti has higher priority, Tj aborts
    otherwise Ti waits
  • If a transaction re-starts, make sure it has its
    original timestamp
  • Practice http//dev.mysql.com/doc/refman/5.0/en/i
    nnodb-deadlocks.html

33
Deadlock Detection
17. CC
  • Create a waits-for graph
  • Nodes are transactions
  • There is an edge from Ti to Tj if Ti is waiting
    for Tj to release a lock
  • Periodically check for cycles in the waits-for
    graph
  • Note that waits-for graph is opposite direction
    of precedence graph.

34
Deadlock Detection (Continued)
17. CC
  • Example
  • T1 S(A), R(A), S(B)
  • T2 X(B),W(B) X(C)
  • T3 S(C), R(C) X(A)
  • T4 X(B)

T1
T2
T4
T3
35
Multiple-Granularity Locks
17. CC
  • Hard to decide what granularity to lock (tuples
    vs. pages vs. tables).
  • Shouldnt have to decide!
  • Data containers are nested

contains
36
Solution New Lock Modes, Protocol
17. CC
  • Allow Xacts to lock at each level, but with a
    special protocol using new intention locks
  • Before locking an item, Xact must set intention
    locks on all its ancestors.
  • IX(IS) Intend to X(S) lock a subset.
  • SIX S IX at the same time. Used to scan and
    update selected records.

37
Multiple Granularity Lock Protocol
17. CC
  • Each Xact starts from the root of the hierarchy.
  • To get S or IS lock on a node, must hold IS or IX
    on parent node.
  • To get X or IX or SIX on a node, must hold IX or
    SIX on parent node.
  • Must release locks in bottom-up order.
  • Sometimes hard to decide granularity of locks.
    Can start small and use lock escalation.

38
Examples
17. CC
  • T1 scans R, and updates a few tuples
  • T1 gets an SIX lock on R, then repeatedly gets an
    S lock on tuples of R, and occasionally upgrades
    to X on the tuples.
  • T2 uses an index to read only part of R
  • T2 gets an IS lock on R, and repeatedly
    gets an S lock on tuples of R.
  • T3 reads all of R
  • T3 gets an S lock on R.
  • OR, T3 could behave like T2 can
    use lock escalation to decide
    which.

39
Dynamic Databases Phantoms
17. CC
Name Rank Age
Pehr 1 25
John 2 26
Lorr 2 23
  • If we allow updates, even Strict 2PL will not
    assure serializability
  • T1 finds oldest sailor in each rank
  • T2 inserts(Rohi,1,27) and deletes John
  • Schedule is
  • This schedule is Strict 2PL, but not
    serializable!
  • Result of this schedule is (1,Pehr)(2,Lorr)
  • Result of T1T2 is (1,Pehr)(2,John)
  • Result of T2T1 is (1,Rohi)(2,Lorr)

T1 rank 1
T1 rank 2
T2 inserts Rohi, deletes John
40
The Problem
17. CC
  • When T1 retrieved the oldest sailor of rank 1, it
    locked each sailor of rank 1 with a read lock.
  • None of these locks applied to the new record (a
    phantom) inserted by T2.
  • We need a mechanism to prevent phantoms to allow
    T1 to lock present and future sailors with rank
    1.
  • There are two such mechanisms, index locking and
    predicate locking.

41
Index Locking
Data
Index
r1
  • If there is a dense index on the rating field
    using Alternative (2), T1 should lock the index
    page(s) containing the data entries with rating
    1.
  • If there are no records with rating 1, T1 must
    lock the index page where such a data entry would
    be, if it existed!
  • If there is no suitable index, T1 must lock all
    pages, and lock the file/table to prevent new
    pages from being added, to ensure that no new
    records with rating 1 are added.

42
Predicate Locking
17. CC
  • Grant lock on all records that satisfy some
    logical predicate, e.g. age gt 2salary.
  • Index locking is a special case of predicate
    locking for which an index supports efficient
    implementation of the predicate lock.
  • What is the predicate in the sailor example?
  • In general, predicate locking has a lot of
    locking overhead.

43
Locking in B Trees
17. CC
  • How can we efficiently lock a B tree?
  • Btw, dont confuse this with multiple granularity
    locking!
  • One solution Ignore the tree structure, just
    lock pages while traversing the tree, following
    2PL.
  • This has terrible performance!
  • Root node (and many higher level nodes) become
    bottlenecks because every tree access begins at
    the root. This single threads all updates to the
    tree.

44
Two Useful Observations
17. CC
  • Higher levels of the tree only direct searches
    for leaf pages.
  • For inserts/deletes, a node on a path from root
    to modified leaf must be locked (in X mode, of
    course), only if a split can propagate up to it
    from the modified leaf.
  • We can exploit these observations to design
    efficient locking protocols that guarantee
    serializability even though they violate 2PL.

45
A Tree Locking Algorithm
17. CC
  • Search Start at root and go down repeatedly, S
    lock child then unlock parent.
  • Insert/Delete Start at root and go down,
    obtaining X locks as needed. Once child is
    locked, check if it is safe
  • If child is safe, release all locks on ancestors.
  • Safe node Node such that the change will not
    propagate up beyond this node.
  • Inserts Node is not full.
  • Deletes Node is not half-empty.

46
Example
ROOT
Do 1) Search 38 2) Delete 38 3) Insert
45 4) Insert 25
A
20
35
B
35
23
38 44
35
35
C
F
38
44
23
H
D
E
G
I
20
22
23
24
35
36
38
41
44
47
Optimistic CC (Kung-Robinson)
17. CC
  • Locking is a conservative approach in which
    conflicts are prevented. Disadvantages
  • Lock management overhead.
  • Deadlock detection/resolution.
  • Lock contention for heavily used objects.
  • If conflicts are rare, we might be able to gain
    concurrency by not locking, and instead checking
    for conflicts before Xacts commit.
  • A version of this optimistic approach is used by
    PostgreSQL and Oracle

48
Kung-Robinson Model
17. CC
  • Xacts have three phases
  • READ Xacts read from the database, but make
    changes to private copies of objects.
  • VALIDATE Check for conflicts.
  • WRITE Make local copies of changes public.

old
ROOT
modified objects
new
49
Validation
17. CC
  • Each Xact is assigned a numeric id.
  • Just use a timestamp.
  • Xact ids assigned at end of READ phase, just
    before validation begins.
  • ReadSet(Ti) Set of objects read by Xact Ti.
  • WriteSet(Ti) Set of objects modified by Ti.

50
Test 1
17. CC
  • For all i and j such that TSi lt TSj, check that
    Ti completes before Tj begins.

Ti
Tj
R
V
W
R
V
W
51
Test 2
17. CC
  • For all i and j such that Ti lt Tj, check that
  • Ti completes before Tj begins its Write phase
  • WriteSet(Ti) ReadSet(Tj) is empty.

Ti
R
V
W
Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
52
Test 3
17. CC
  • For all i and j such that Ti lt Tj, check that
  • Ti completes Read phase before Tj does
  • WriteSet(Ti) ReadSet(Tj) is empty
  • WriteSet(Ti) WriteSet(Tj) is empty.

Tj
R
V
W
Does Tj read dirty data? Does Ti overwrite Tjs
writes?
53
Overheads in Optimistic CC
17. CC
  • Must record read/write activity in ReadSet and
    WriteSet per Xact.
  • Must create and destroy these sets as needed.
  • Must check for conflicts during validation, and
    must make validated writes global.
  • Critical section can reduce concurrency.
  • Scheme for making writes global can reduce
    clustering of objects.
  • Optimistic CC restarts Xacts that fail
    validation.
  • Work done so far is wasted requires clean-up.
Write a Comment
User Comments (0)
About PowerShow.com