Transaction Processing Concepts - PowerPoint PPT Presentation

1 / 76
About This Presentation
Title:

Transaction Processing Concepts

Description:

Construct a directed graph, precedence graph, G = (V, E) ... consistent with precedence order of graph) ... (precedence order) no cycles S is serializable. 42 ... – PowerPoint PPT presentation

Number of Views:211
Avg rating:3.0/5.0
Slides: 77
Provided by: joac72
Category:

less

Transcript and Presenter's Notes

Title: Transaction Processing Concepts


1
Transaction Processing Concepts
Lecture Notes
  • Chapter 17 18 from Elmasri and Navathes book

2
Outline of Lecture
  • Introduction to Transaction Processing
  • Problems and Solution
  • Concepts
  • Properties

3
Why Do We Need Transactions?
  • Its all about fast query response time and
    correctness
  • DBMS is a multi-user systems
  • Many different requests
  • Some against same data items
  • Figure out how to interleave requests to shorten
    response time while guaranteeing correct result
  • How does DBMS know which actions belong together?
  • Solution Group database operations that must be
    performed together into transactions
  • Either execute all operations or none

4
Concurrent Transactions
B
B
CPU2
A
A
CPU1
CPU1
time
t1
t2
t1
t2
interleaved processing
parallel processing
5
Terminology
  • A transaction T is a logical unit of database
    processing that includes one or more database
    access operations
  • Embedded within application program
  • Specified interactively (e.g., via SQL)
  • Transaction boundaries Begin/end transaction
  • Read vs. write transaction

6
Sample Transaction (informal)
  • Example Move 40 from checking to savings
    account
  • To user, appears as one activity
  • To database
  • Read balance of checking account read( X)
  • Read balance of savings account read (Y)
  • Subtract 40 from X
  • Add 40 to Y
  • Write new value of X back to disk
  • Write new value of Y back to disk

7
Another Sample Transaction
  • Reserving a seat for a flight
  • If concurrent access to data in DBMS, two users
    may try to book the same seat simultaneously

Agent 1 finds seat 35G empty
time
Agent 2 finds seat 35G empty
Agent 1 sets seat 35G occupied
Agent 2 sets seat 35G occupied
8
Terminology
  • Basic access operations
  • read_item(X)
  • write_item(X)
  • How are read_item or write_item implemented?
  • Read-set of T all items that transaction reads
  • Write-set of T all items that transaction writes

9
Sample Transaction (Formal)
  • T1
  • read_item(X)
  • read_item(Y)
  • XX-40
  • YY40
  • write _item(X)
  • write_item(Y)

t0
tk
10
What Can Go Wrong?
  • Several problems can occur when concurrent
    transactions execute without control
  • Illustrate several scenarios on the next slides

11
Lost Update Problem
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • write_item(Y)

T2 read_item(X) XXM write_item(X)
time
12
Temporary Update (Dirty Read)
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • T1 fails and aborts

T2 read_item(X) XXM write_item(X)
time
13
Incorrect Summary Problem
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • Write_item(Y)

T2 sum0 read_item(A) sumsumA read_item(X)
sumsumX read_item(Y) sumsumY
time
14
What Can Go Wrong?
  • System may crash before data is written back to
    disk
  • Problem of atomicity
  • Some other transaction is modifying shared data
    while our transaction is ongoing (or vice versa)
  • Problem of serialization and isolation
  • System may not be able to obtain one or more of
    the data items
  • System may not be able to write one or more of
    the data items
  • Also problems of atomicity
  • DBMS has a Concurrency Control subsytem to assure
    database remains in consistent state despite
    concurrent execution of transactions

15
Other Problems
  • System failures may occur
  • Types of failures
  • System crash
  • Transaction or system error
  • Local errors
  • Concurrency control enforcement
  • Disk failure
  • Physical failures
  • DBMS has a Recovery Subsystem to protect database
    against system failures

16
Transaction States
  • BEGIN_TRANSACTION marks start of transaction
  • READ or WRITE two possible operations on the
    data
  • END_TRANSACTION marks the end of the read or
    write operations start checking whether
    everything went according to plan
  • COMIT_TRANSACTION signals successful end of
    transaction changes can be committed to DB
  • ROLLBACK (or ABORT) signals unsuccessful end of
    transaction, changes applied to DB must be undone

17
State Diagram
BEGIN TRANSACTION
Active
END TRANSACTION
COMMIT
Partially Committed
Committed
READ, WRITE
ABORT
ABORT
Terminated
Failed
18
System Log
  • Remember, DBMS must assure that we dont loose
    information due to system crashes
  • i.e., how do we recover from failure?
  • Keep system log
  • Kept on disk, backed up periodically
  • Record every action
  • start_transaction, T
  • write_item, T, X, old, new
  • read_item, T, X
  • commit, T
  • abort, T

19
How is the Log File Used?
  • All permanent changes to data are recorded
  • Possible to undo changes to data
  • After crash, search log backwards until find last
    commit point
  • Know that beyond this point, effects of
    transaction are permanently recorded
  • Need to either redo or undo everything that
    happened since last commit point
  • Undo When transaction only partially completed
    (before crash)
  • Redo Transaction completed but we are unsure
    whether data was written to disk

20
ACID Properties of Transactions
  • Atomicity Transaction is either performed in its
    entirety or not performed at all
  • Task of the recovery subsystem to enforce
    atomicity
  • Consistency preservation Transaction must take
    the database from one consistent state to another
  • Users/DBMS enforce implicit and explicit
    constraints
  • Isolation Transaction should appear as though it
    is being executed in isolation from other
    transaction
  • Enforced by concurrency control subsystem
  • Durability Changes applied to the database by a
    committed transaction must persist
  • Enforced by recovery subsystem

21
Transaction Schedules
Lecture Notes
22
Outline of Lecture
  • Transaction Schedules
  • Recoverable Schedules
  • Serializability
  • Definition
  • Test

23
Transaction Schedule
  • Recall Multiple transactions can be executed
    concurrently by interleaving their operations
  • Ordering of execution of operations from various
    transactions T1, T2, , Tn is called a schedule
    S
  • Constraint For each transaction Ti, the order in
    which operations occur in S must the same as in
    Ti
  • Only interested in read (r), write (w), commit
    (c), abort (a)

24
Sample Schedule
  • T1 r(X) w(X) r(Y) w(Y) c
  • T2 r(X) w(X) c
  • Sample schedule
  • S r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y) c1
    c2

25
Conflicts
  • Two operations conflict if they satisfy ALL three
    conditions
  • they belong to different transactions AND
  • they access the same item AND
  • at least one is a write_item()operation
  • Ex.
  • S r1(X) r2(X) w1(X) r1(Y) w2(X) w1(Y)

conflicts
26
Why Do We Interleave Transactions?
Schedule S
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • write_item(Y)

T2 read_item(X) XXM write_item(X)
Could be a long wait
S is a serial schedule no interleaving!
27
Serial Schedule
  • If we consider transactions to be independent,
    serial schedule is correct
  • Based on C property in ACID above assumption is
    valid
  • Furthermore, it does not matter which transaction
    is executed first, as long as every transaction
    is executed in its entirety, from beginning to
    end
  • Assume X90, Y90, N3, M2, then result of
    schedule S is X89 and Y 93
  • Same result if we start with T2

28
Better?
Schedule S
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • write_item(Y)

T2 read_item(X) XXM write_item(X)
S is a non-serial schedule T2 will be done
faster but is the result correct?
29
Concurrent Executions
  • Serial execution is by far simplest method to
    execute transactions
  • No extra work ensuring consistency
  • Inefficient!
  • Reasons for concurrency
  • Increased throughput
  • Reduces average response time
  • Need concept of correct concurrent execution
  • Using same X, Y, N, M values as before, result of
    S is X92 and Y93 (not correct)

30
Better?
Schedule S
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • write_item(Y)

T2 read_item(X) XXM write_item(X)
S is a non-serial schedule Produces same result
as serial schedule S
31
Serializability
  • Assumption Every serial schedule is correct
  • Goal Like to find non-serial schedules which are
    also correct
  • A schedule S of n transactions is serializable if
    it is equivalent to some serial schedule of the
    same n transactions
  • When are two schedules equivalent?
  • Option 1 They lead to same result (result
    equivalent)
  • Option 2 The order of any two conflicting
    operations is the same (conflict equivalent)

32
Result Equivalent Schedules
  • Two schedules are result equivalent if they
    produce the same final state of the database
  • Problem May produce same result by accident!

S1 read_item(X) XX10 write_item(X)
S2 read_item(X) XX1.1 write_item(X)
Schedules S1 and S2 are result equivalent for
X100 but not in general
33
Conflict Equivalent Schedules
  • Two schedules are conflict equivalent, if the
    order of any two conflicting operations is the
    same in both schedules

34
Conflict Equivalence
Serial Schedule S1
  • T1
  • read_item(A)
  • write_item(A)
  • read_item(B)
  • write_item(B)

T2 read_item(A) write_item(A) read_item(B)
write_item(B)
order doesnt matter
order matters
order doesnt matter
order matters
35
Conflict Equivalence
Schedule S1
  • T1
  • read_item(A)
  • write_item(A)
  • read_item(B)
  • write_item(B)

T2 read_item(A) write_item(A) read_item(B)
write_item(B)
same order as in S1
same order as in S1
S1 and S1 are conflict equivalent (S1 produces
the same result as S1)
36
Conflict Equivalence
Schedule S1
  • T1
  • read_item(A)
  • write_item(A)
  • read_item(B)
  • write_item(B)

T2 read_item(A) write_item(A) read_item(B) wr
ite_item(B)
different order as in S1
different order as in S1
Schedule S1 is not conflict equivalent to
S1 (produces a different result than S1)
37
Conflict Serializable
  • Schedule S is conflict serializable if it is
    conflict equivalent to some serial schedule S
  • Can reorder the non-conflicting operations to
    improve efficiency
  • Non-conflicting operations
  • Reads and writes from same transaction
  • Reads from different transactions
  • Reads and writes from different transactions on
    different data items
  • Conflicting operations
  • Reads and writes from different transactions on
    same data item

38
Example
Schedule A
Schedule B
  • T1
  • read_item(X)
  • XX-N
  • write_item(X)
  • read_item(Y)
  • YYN
  • write_item(Y)

T2 read_item(X) XXM write_item(X)
T1 read_item(X) XX-N write_item(X) read_it
em(Y) YYN write_item(Y)
T2 read_item(X) XXM write_item(X)
B is conflict equivalent to A ? B is serializable
39
Test for Serializability
  • Construct a directed graph, precedence graph, G
    (V, E)
  • V set of all transactions participating in
    schedule
  • E set of edges Ti ? Tj for which one of the
    following holds
  • Ti executes a write_item(X) before Tj executes
    read_item(X)
  • Ti executes a read_item(X) before Tj executes
    write_item(X)
  • Ti executes a write_item(X) before Tj executes
    write_item(X)
  • An edge Ti ? Tj means that in any serial schedule
    equivalent to S, Ti must come before Tj
  • If G has a cycle, than S is not conflict
    serializable
  • If not, use topological sort to obtain
    serialiazable schedule (linear order consistent
    with precedence order of graph)

40
Sample Schedule S
  • T1
  • read_item(X)
  • write_item(X)
  • read_item(Y)
  • write_item(Y)

T2 read_item(Z) read_item(Y) write_item(
Y) read_item(X) write_item(X)
T3 read_item(Y) read_item(Z) write_item(Y) wr
ite_item(Z)
41
Precedence Graph for S
X,Y
T1
T2
Y,Z
Y
no cycles ? S is serializable
T3
Equivalent Serial Schedule T3 ? T1 ? T2
(precedence order)
42
Characterizing Schedules based on Serializability
  • Being serializable is not the same as being
    serial
  •  
  • Being serializable implies that the schedule is a
    correct schedule.
  • It will leave the database in a consistent state.
  • The interleaving is appropriate and will result
    in a state as if the transactions were serially
    executed, yet will achieve efficiency due to
    concurrent execution.

43
Characterizing Schedules based on Serializability
  • Serializability is hard to check.
  • Interleaving of operations occurs in an operating
    system through some scheduler
  • Difficult to determine beforehand how the
    operations in a schedule will be interleaved.

44
Characterizing Schedules based on Serializability
  • Practical approach
  • Come up with methods (protocols) to ensure
    serializability.
  • Its not possible to determine when a schedule
    begins and when it ends. Hence, we reduce the
    problem of checking the whole schedule to
    checking only a committed project of the schedule
    (i.e. operations from only the committed
    transactions.)
  • Current approach used in most DBMSs
  • Use of locks with two phase locking

45
Characterizing Schedules based on Serializability
  • View equivalence A less restrictive definition
    of equivalence of schedules
  • View serializability definition of
    serializability based on view equivalence. A
    schedule is view serializable if it is view
    equivalent to a serial schedule.

46
Characterizing Schedules based on Serializability
  • Two schedules are said to be view equivalent if
    the following three conditions hold
  • The same set of transactions participates in S
    and S, and S and S include the same operations
    of those transactions.
  • For any operation Ri(X) of Ti in S, if the value
    of X read by the operation has been written by an
    operation Wj(X) of Tj (or if it is the original
    value of X before the schedule started), the same
    condition must hold for the value of X read by
    operation Ri(X) of Ti in S.
  • If the operation Wk(Y) of Tk is the last
    operation to write item Y in S, then Wk(Y) of Tk
    must also be the last operation to write item Y
    in S.

47
Characterizing Schedules based on Serializability
  • The premise behind view equivalence
  • As long as each read operation of a transaction
    reads the result of the same write operation in
    both schedules, the write operations of each
    transaction musr produce the same results.
  • The view the read operations are said to see
    the the same view in both schedules.

48
Characterizing Schedules based on Serializability
  • Relationship between view and conflict
    equivalence
  • The two are same under constrained write
    assumption which assumes that if T writes X, it
    is constrained by the value of X it read i.e.,
    new X f(old X)
  • Conflict serializability is stricter than view
    serializability. With unconstrained write (or
    blind write), a schedule that is view
    serializable is not necessarily conflict
    serialiable.
  • Any conflict serializable schedule is also view
    serializable, but not vice versa.

49
Characterizing Schedules based on Serializability
  • Relationship between view and conflict
    equivalence (cont)
  • Consider the following schedule of three
    transactions
  • T1 r1(X), w1(X) T2 w2(X) and T3 w3(X)
  • Schedule Sa r1(X) w2(X) w1(X) w3(X) c1 c2
    c3
  • In Sa, the operations w2(X) and w3(X) are blind
    writes, since T1 and T3 do not read the value of
    X.
  • Sa is view serializable, since it is view
    equivalent to the serial schedule T1, T2, T3.
    However, Sa is not conflict serializable, since
    it is not conflict equivalent to any serial
    schedule.

50
Concurrency Control Techniques
Lecture Notes
51
Outline of Lecture
  • Concurrency Based on Locking Techniques
  • Locks
  • 2-Phase Locking Protocol

52
Remember ACID Properties?
  • Atomicity ? enforced by transaction recovery
    subsystem
  • Consistency ? enforced by application programmers
    and part of DBMS that enforces integrity
    constraints
  • Isolation ? enforced by concurrency control
    system
  • Durability ? enforced by transaction recovery
    subsystem
  • Now, focus on concurrency control
  • Ensure serializability of schedules through
    protocols that result in serializable schedules

53
Quick Recap
  • Difference between serial and serializable
  • Real DBMS does not test for serializability
  • Very inefficient since transactions are
    continuously arriving
  • Would require a lot of undoing
  • Solution concurrency protocols
  • If followed by every transaction, and if enforced
    by transaction processing system, guarantee
    serializability of schedules in which transaction
    appears

54
Concurrency Control Through Locks
  • Lock variable associated with each data item
  • Describes status of item wrt operations that can
    be performed on it
  • Binary locks Locked/unlocked
  • Enforces mutual exclusion
  • Multiple-mode locks Read/write
  • a.k.a. Shared/Exclusive
  • Three operations
  • read_lock(X)
  • write_lock(X)
  • unlock(X)
  • Each data item can be in one of three lock states

55
Implementation
  • Maintain lock table
  • Keep track of locked items and their locks
  • ltdata item, LOCK, no_of_reads,
    locking_transactiongt
  • For read locks, keep track of the number of
    transactions that hold a read lock on an item

56
Locking Rules
  • T must issue read_lock(X) or write_lock(X) before
    any read_item(X) op is performed in T
  • T must issue write_lock(X) before any
    write_item(X) op is performed in T
  • T must issue unlock(X) after all read_item(X) and
    write_item(X) ops are completed in T
  • T will not issue a read_lock(X) if it already
    holds a read lock or write lock on X (may be
    relaxed)
  • T will not issue a write_lock(X) if it already
    holds a read lock or write lock on X (may be
    relaxed)

57
Lock Conversions
  • Sometimes beneficial to relax locking rules 4 and
    5
  • Upgrade read lock on X to a write lock (by
    issuing a write_lock(X) )
  • Only possible if T is the only transaction
    holding a read lock on X
  • Downgrade a write lock by issuing a read_lock(X)
  • Must be noted in lock table

58
Granting of Locks
  • Suppose T2 has read-lock on item X
  • T1 is requesting write-lock on item X needs to
    wait for T2 to release
  • T3 requests read-lock on X request is granted
  • Assume shortly thereafter T2 relinquishes
    read-lock
  • Continue scenario through a sequence of
    transactions all requesting read-lock on X
  • T1 will never make progress
  • T1 is said to be starved

59
Granting of Locks
  • How do you avoid starvation in the presence of
    locks?
  • Assume Ti requesting lock on Q
  • Grant lock provided that
  • No locking conflict with lock requested by Ti, OR
  • No other transaction waiting for lock and made
    request before Ti

60
Two Transactions
T1 read_lock(Y) read_item(Y) unlock(Y) write_lo
ck(X) read_item(X) XXY write_item(X) unlock
(X)
T2 read_lock(X) read_item(X) unlock(X) write_lo
ck(Y) read_item(Y) YXY write_item(Y) unlock
(Y)
Lets assume serial schedule S1 T1T2 Initial
values X20, Y30 ? Result X50, Y80
61
Locks Alone Dont Do the Trick!
Lets run T1 and T2 in interleafed fashion
Schedule S
T1 read_lock(Y) read_item(Y) unlock(Y)
write_lock(X) read_item(X) XXY write_item(X)
unlock(X)
T2 read_lock(X) read_item(X) unlock(X) write
_lock(Y) read_item(Y) YXY write_item(Y) unl
ock(Y)
unlocked too early!
Non-serializable! Result X50, Y50
62
Two-Phase Locking (2PL)
  • Def. Transaction is said to follow the
    two-phase-locking protocol if all locking
    operations precede the first unlock operation
  • Expanding (growing) first phase
  • Shrinking second phase
  • During the shrinking phase no new locks can be
    acquired!
  • Downgrading ok
  • Upgrading is not

63
Example
T1 read_lock(Y) read_item(Y) write_lock(X) unl
ock(Y) read_item(X) XXY write_item(X) unloc
k(X)
T2 read_lock(X) read_item(X) write_lock(Y) unl
ock(X) read_item(Y) YXY write_item(Y) unloc
k(Y)
  • Both T1 and T2 follow the 2PL protocol
  • Any schedule including T1 and T2 is guaranteed
    to be serializable
  • Limits the amount of concurrency

64
Variations to the Basic Protocol
  • Previous technique knows as basic 2PL
  • Conservative 2PL (static) 2PL Lock all items
    needed BEFORE execution begins by predeclaring
    its read and write set
  • If any of the items in read or write set is
    already locked (by other transactions),
    transaction waits (does not acquire any locks)
  • Deadlock free but not very realistic

65
Variations to the Basic Protocol
  • Strict 2PL Transaction does not release its
    write locks until AFTER it aborts/commits
  • Not deadlock free but guarantees recoverable
    schedules (strict schedule transaction can
    neither read/write X until last transaction that
    wrote X has committed/aborted)
  • Most popular variation of 2PL

66
Variations to the Basic Protocol
  • Rigorous 2PL No lock is released until after
    abort/commit
  • Transaction is in its expanding phase until it
    ends

67
Concluding Remarks
  • Concurrency control subsystem is responsible for
    inserting locks at right places into your
    transaction
  • Strict 2PL is widely used
  • Requires use of waiting queue
  • All 2PL locking protocols guarantee
    serializability
  • Does not permit all possible serial schedules
  • Conservative and rigorous 2PL charge a high price
    for serializability
  • However, deadlock-based algorithms may suffer
    from starvation and deadlock (see next lecture)

68
Concurrency Control Techniques
Lecture Notes
69
Outline of Lecture
  • Two Problems with Locks
  • Deadlock
  • Starvation
  • Concurrency Control Based on Timestamp Ordering

70
What is Deadlock?
  • Occurs when each transaction Ti in a set of two
    or more is waiting on an item locked by some
    other transaction Tj in the set

S
T2 read_lock(X) read_item(X) write_lock(Y)
T1 read_lock(Y) read_item(Y) write_lock(X)
T1
T2
71
Deadlock Prevention
  • Locking as deadlock prevention leads to very
    inefficient schedules (e.g., conservative 2PL)
  • Better, use transaction timestamp TS(T)
  • TS is unique identifier assigned to each
    transaction
  • if T1 starts before T2, then TS(T1) lt TS(T2)
    (older has smaller timestamp value)
  • Wait-die and wound-wait schemes

72
Wait-Die Scheme
  • Assume Ti tries to lock X which is locked by Tj
  • If TS(Ti) lt TS(Tj) (Ti older than Tj), then Ti is
    allowed to wait
  • Otherwise, Ti younger than Tj, abort Ti (Ti dies)
    and restart later with SAME timestamp
  • Older transaction is allowed to wait on younger
    transaction
  • Younger transaction requesting an item held by
    older transaction is aborted and restarted

73
Wound-Wait Scheme
  • Assume Ti tries to lock X which is locked by Tj
  • If TS(Ti) lt TS(Tj) (Ti older than Tj), abort Tj
    (Ti wounds Tj) and restart later with SAME
    timestamp
  • Otherwise, Ti younger than Tj, Ti is allowed to
    wait
  • Younger transaction is allowed to wait on older
    transaction
  • Older transaction requesting item held by younger
    transaction preempts younger one by aborting it
  • Both schemes abort younger transaction that may
    be involved in deadlock
  • Both deadlock free but may cause needless aborts

74
More Deadlock Prevention
  • Waiting schemes (require no timestamps)
  • No waiting if transaction cannot obtain lock,
    aborted immediately and restarted after time t
  • ? needless restarts
  • Cautious waiting
  • Suppose Ti tries to lock item X which is locked
    by Tj
  • If Tj is not blocked, Ti is blocked and allowed
    to wait
  • O.w. abort Ti
  • Cautious waiting is deadlock-free

75
Deadlock Detection
  • DBMS checks if deadlock has occurred
  • Works well if few short transactions with little
    interference
  • O.w., use deadlock prevention
  • Two approaches to deadlock detection
  • Wait-for graph
  • If cycle, abort one of the transactions (victim
    selection)
  • Timeouts

76
Starvation
  • Transaction cannot continue for indefinite amount
    of time while others proceed normally
  • When? Unfair waiting scheme with priorities for
    certain transactions
  • E.g., in deadlock detection, if we choose victim
    always based on cost factors, same transaction
    may always be picked as victim
  • Include rollbacks in cost factor
Write a Comment
User Comments (0)
About PowerShow.com