Title: Transaction Processing
1Transaction Processing
2General Overview
- Where we ve been...
- DBA skills for relational DBs
- Logical Schema Design
- E/R diagrams
- Decomposition and Normalization
- Query languages
- RA, SQL
- Integrity Constraints
- Transaction (today)
- Where we are going
- Database Implementation Issues (from next week)
3 What Does a DBMS Manage?
- 1. Data organization
- E/R Model
- Relational Model
- 2. Data Retrieval
- Relational Algebra
- SQL
- 3. Data Integrity
- Integrity Constraints
- Transactions
4Updates in SQL
An example UPDATE account SET balance balance
-50 WHERE acct_no A102
What takes place
memory
Disk
(1) Read
Dntn A102 300
account
Dntn A15 500
(2) update
- Transaction
- Read(A)
- A lt- A -50
- Write(A)
(3) write
Mian A142 300
5The Threat to Data Integrity
Consistent DB Name Acct bal --------
------ ------ Joe A-33 300 Joe
A-509 100 Joes total 400
Inconsistent DB Name Acct bal --------
------ ------ Joe A-33 250 Joe
A-509 100 Joes total 350
transaction
Consistent DB Name Acct bal --------
------ ------ Joe A-33 250 Joe
A-509 150 Joes total 400
What a Xaction should look like to Joe
What actually happens during execution
6Transactions
- What?
- Updates to db that can be executed concurrently
- Why?
- (1) Updates can require multiple reads,
writes on a db - e.g., transfer 50 from A-33 to A509
read(A) A ? A -50 write(A)
read(B) B?B50 write(B)
(2) For performance reasons, dbs permit updates
to be executed concurrently
Concern concurrent access/updates of data can
compromise data integrity
7ACID Properties
Properties that a Xaction needs to have
- Atomicity either all operations in a Xaction
take effect, or none - Consistency operations, taken together preserve
db consistency - Isolation intermediate, inconsistent states must
be concealed from other Xactions - Durability. If a Xaction successfully completes
(commits), changes made to db must persist,
even if system crashes
8Demonstrating ACID
- Transaction to transfer 50 from account A to
account B - 1. read(A)
- 2. A A 50
- 3. write(A)
- 4. read(B)
- 5. B B 50
- 6. write(B)
Consistency total value AB, unchanged by Xaction
Atomicity if Xaction fails after 3 and before 6,
3 should not affect db
Durability once user notified of Xaction commit,
updates to A,B should not be undone by system
failure
Isolation other Xactions should not be able to
see A, B between steps 3-6
9Threats to ACID
1. Programmer Error e.g. 50 substracted from
A, 30 added to B ? threatens consistency 2.
System Failures e.g. crash after write(A) and
before write(B) ? threatens atomicity e.g. crash
after write(B) ? threatens durability 3.
Concurrency E.g. concurrent Xaction reads A, B
between steps 3-6 ? threatens isolation
10Isolation
- Simplest way to guarantee forbid concurrent
Xactions! - But, concurrency is desirable
- (1) Achieves better throughput (TPS
transactions per second) - one Xaction can use CPU while another is
waiting for disk to service request - (2) Achieves better average response time
- short Xactions dont need to get stuck
behind long ones
Prohibiting concurrency is not an option
11Isolation
- Approach to ensuring Isolation
- Distinguish between good and bad concurrency
- Prevent all bad (and sometime some good)
concurrency from happening OR - Recognize bad concurrency when it happens and
undo its effects (abort some transactions) - Pessimistic vs Optimistic CC
- Both pessimistic and optimistic approaches
require distinguishing between good and bad
concurrency
- How concurrency characterized in terms of
possible Xaction schedules
12Schedules
- Schedules sequences that indicate the
chronological order in which instructions of
concurrent transactions are executed - a schedule for a set of transactions must consist
of all instructions of those transactions - must preserve the order in which the instructions
appear in each individual transaction
T2 A B C D
T1 1 2 3
T1 1 2 3
T2 A B C D
one possible schedule
13Example Schedules
Transactions T1 transfers 50
from A to B T2 transfers 10 of A
to B
Example 1 a serial schedule
T1 read(A) A lt- A -50 write(A) read(B) Blt-B50 wri
te(B)
T2 read(A) tmp lt- A0.1 A lt- A
tmp write(A) read(B) B lt- B tmp write(B)
Constraint The sum of AB must be the same
Before 10050
150, consistent
After 45105
14Example Schedule
T1 read(A) A lt- A -50 write(A) read(B) Blt-
B50 write(B)
T2 read(A) tmp lt- A0.1 A lt- A
tmp write(A) read(B) B lt- B tmp write(B)
Before 10050
150, consistent
After 40110
Consistent but not the same as previous
schedule.. Either is OK!
15Example Schedule (Cont.)
T1 read(A) A lt- A -50 write(A) read(B) Blt-B5
0 write(B)
T2 read(A) tmp lt- A0.1 A lt- A
tmp write(A) read(B) B lt- B tmp write(B)
Effect Before After A
100 45 B 50
105
Same as one of the serial schedules Serializable
16Example Schedules (Cont.)
T1 read(A) A lt- A -50 write(A) read(B) Blt-B
50 write(B)
T2 read(A) tmp lt- A0.1 A lt- A
tmp write(A) read(B) B lt- B tmp write(B)
Before 10050 150
After 5060 110 !!
Not consistent
17Serializability
- How to distinguish good and bad schedules?
- ? for previous example, any schedule leaving AB
150 is good
- Q could we express good schedules in terms of
integrity constraints?
- Ans No. In general, wont know AB, cant check
value of AB at given time - for consistency
Alternative Serializability
18Serializability
Serializable A schedule is serializable if its
effects on the db are the equivalent to some
serial schedule.
Hard to esnure more conservative approaches are
used in practice
All schedules
SQL serializable
Serializable schedules
view serializable schedules
conflict serializable schedules
19Conflict Serializability
- Conservative approximation of serializability
- (conflict serializable gt serializable but
lt doesnt hold) - Idea can we swap the execution order of
consecutive operation - wo/ affecting state of db, Xactions so as to
leave a serial schedule?
20Conflict Serializability (Cont.)
- If a schedule S can be transformed into a
schedule S by a series of swaps of
non-conflicting instructions, we say that S and
S are conflict equivalent. - We say that a schedule S is conflict serializable
if it is conflict equivalent to a serial schedule - Ex
T2 . read(A) . . .
T1 . read(A)
T2 . read(A) . . .
T1 . read(A)
can be rewritten to equivalent schedule
21Conflict Serializability (Cont.)
Example
T1 1. Read(A) 2. A ? A -50 3. Write(A)
4.Read(B) 5. B ? B 50 6. Write(B)
T2 a. Read(A) b. tmp ? A 0.1 c. A ? A - tmp
d.Write(A) e. Read(B) f. B ? B tmp g.
Write(B)
Swaps 4 lt-gtd 4lt-gtc 4lt-gtb 4lt-gta
5lt-gtd 5lt-gtc 5lt-gtb 5lt-gta
6lt-gtd 6lt-gtc 6lt-gtb 6lt-gta
Conflict serializble
T1, T2
22Conflict Serializability (Cont.)
The effects of swaps
T1 read(A) A lt- A -50 write(A) read(B) Blt-B50 wri
te(B)
T2 read(A) tmp lt- A0.1 A lt- A
tmp write(A) read(B) B lt- B tmp write(B)
Because example schedule could be swapped to
this schedule (ltT1, T2gt) example schedule is
conflict serializable
23The swaps we made
A. Reads and writes of different data
elements e.g. T1 T2
T1 T2
write(A) read(B)
read(B)
write(A)
OK because value of B unaffected by write of A
( read(B) has same effect )
write of A is not undone
by read of B ( write(A) has same
effect)
Note T1 T2
T1 T2
write(A) read(A)
read(A)
write(A)
Why? In the first, T1 reads value of A written
by T2. May be different value than
previous value of A
24Swaps
T1 T2
T1 T2
write(A) read(A)
read(A)
write(A)
What affect on state of db could above swap make?
Suppose what follows read(A) in T1 is
read(C)
C ?CA
write(C)
Unless T2 writes the same value to A, the
first schedule will leave a different value for C
than the second
25The swaps We Made
A. Reads and writes of different data elements
4 lt-gt d 6 lt-gt
a
B. Reads of different data elements 4 lt-gt
a C. Writes of different data elements 6 lt-gt
d D. Any operation with a local operation OK
because local operations dont go to disk.
Therefore, unaffected by other
operations 4 lt-gt b 5 lt-gt a
.... 4 lt-gt c To simplify, local
operations are ommited from schedules
26Conflict Serializability (Cont.)
Previous example wo/ local operations
T1 1. Read(A) 2. Write(A) 3. Read(B) 4.
Write(B)
Swaps 3 lt-gtb 3lt-gta 4lt-gtb 4lt-gta
T2 a. Read(A) b. Write(A) c. Read(B) d.
Write(B)
T1, T2
27Swappable Operations
- Swappable operations
- Any operation on different data element
- Reads of the same data (Read(A))
- (regardless of order of reads,
the same value for A is read) - Conflicts
T2 Read(A) T2 Write(A)
OK
T1 Read (A) T1 Write (A)
R/W Conflict
W/R Conflict
W/W Conflict
28Conflicts
- (1) READ/WRITE conflicts
- conflict because value read depends on
whether write has occured - (2) WRITE/WRITE conflicts
- conflict because value left in db depends
on which write occured last - (3) READ/READ no conflict
29Conflict Serializability
Q Is the following shcedule conflict
serializable? If so, whats its equivalent
serial schedule? If not, why?
- T1 T2
- (1) read(Q) write(Q) (a)(2)
write(Q)
Ans No. Swapping (a) with (1) is a R/W conflict,
and swapping (a) with (2) is a W/W conflict.
Not equivalent to ltT1, T2gt or ltT2, T1gt
30Conflict Serializability
Q Is the following shcedule conflict
serializable? If so, whats its equivalent
serial schedule? If not, why?
T1 (1) Read(A) (2) Write(S)
T3 (x) Write(B) (y) Read(S)
T2 (a) Write(A) (b) Read(B)
Ans. NO. All possible serial schedules are not
conflict equivalent. ltT1, T2, T3gt ltT1, T3,
T2gt ltT2, T1, T3gt . . . . . .
31Conflict Serializability
- Testing too expensive to test a schedule by
swapping operations - (usually schedules are big!)
- Alternative Precedence Graphs
- vertices Xactions
- edges conflicts between Xactions
- E.g. Ti ? Tj if (1) Ti, Tj have a
conflicting operation, and - (2) Ti executed
its operation in conflict first
32Precedence Graph
- An example of a Precedence Graph
T1 Read(A)
T3 Write(B) Read(S)
T2 Write(A) Read(B)
T1
R/W(A)
T2
R/W(B)
T3
Q When is a schedule not conflict serializable?
33Precedence Graph
T1 Read(A) Write(S)
T3 Write(B) Read(S)
T2 Write(A) Read(B)
T1
R/W(A)
T2
R/W(B)
R/W(S)
T3
Not conflict serializable!! Because there is a
cycle in the PG, the cycle creates contradiction
34Example Schedule (Schedule A)
- T1 T2 T3 T4 T5
- read(X)read(Y)read(Z) read(V)
read(W) read(Y) write(Y) write(Z)
read(U) read(Y) write(Y) read(Z
) write(Z) - read(U)write(U)
35Precedence Graph for Schedule A
R/W (Y)
T1
T2
R/W(Y)
R/W(y), R/W(Z)
R/W(Z)
R/W(Z) , W/W(Z)
T4
T3
36Test for Conflict Serializability
- A schedule is conflict serializable if and only
if its precedence graph is acyclic. - Cycle-detection algorithms exist which take order
n2 time, where n is the number of vertices in the
graph. (Better algorithms take order n e where
e is the number of edges.) - If precedence graph is acyclic, the
serializability order can be obtained by a
topological sorting of the graph. - For example, a serializability order for Schedule
A would beT5 ? T1 ? T3 ? T2 ? T4 .
37View Serializability
- View Equivalence
- S and S are view equivalent if the
following three conditions are met - 1. For each data item Q, if transaction Ti reads
the initial value of Q in schedule S, then
transaction Ti must, in schedule S, also read
the initial value of Q. - 2. For each data item Q, if transaction Ti reads
the value of Q written by Tj in S, it also does
in S - 3. For each data item Q, the transaction (if any)
that performs the final write(Q) operation in
schedule S must perform the final write(Q)
operation in schedule S. - As can be seen, view equivalence is also based
purely on reads - and writes alone.
38View Serializability (Cont.)
- A schedule S is view serializable it is view
equivalent to a serial schedule. - Example
-
T1 Read(A) Write(A)
T2 Write(A)
T3 Write(A)
Is this schedule view serializable? conflict
serializable?
VS Yes. Equivalent to ltT1, T2, T3gt CS No. PG
has a cycle.
- Every view serializable schedule that is not
conflict serializable has blind writes.
39View Serializability
conflict serializable
(1) We just showed
view serializable
view serializable
(2) We can also show
serializable
40Other Notions of Serializability
- Equivalent to the serial schedule lt T1, T2 gt, yet
is not conflict equivalent or view equivalent to
it. -
-
T1 Read(A) A ? A -50 Write(A) Read(B) B ?
B 50 Write(B)
T2 Read(B) B ? B - 10 Write(B) Read(A) A
? A 10 Write(A)
- Determining such equivalence
- requires analysis of operations
- other than read and write.
41Commit
- Commit at the end of a transaction
T1 R(A) W(A) R(B) W(B) Commit
T2 R(A) W(A) R(B) W(B) Commit
When a transaction terminates commits its changes
first
42Transaction Definition in SQL
- Data manipulation language must include a
construct for specifying the set of actions that
comprise a transaction. - In SQL, a transaction begins implicitly. (for
example select, update or create table) - A transaction in SQL ends by
- Commit work commits current transaction and
begins a new one. - Rollback work causes current transaction to abort.
43Transactions in SQL
- In SQL you can specify three characteristics
- Access mode Read only, read write
- Isolation level
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
- SERIALIZABLE
- Diagnostics size
44Transactions in SQL
- Serializable default
- - can read only commited records
- - if T is reading or writing X, no other
Xaction can change X until T commits - - if T is updating a set of records
(identified by WHERE clause), no other Xaction
can change this set until T commits
Idea tradeoff More concurrency gt more
overhead to ensure
valid
schedule. Lower degrees of consistency useful for
gathering approximateinformation about the
database, e.g., statistics for query optimizer.