Title: Lecture 10: Transactions
1Lecture 10 Transactions
2The Setting
- Database systems are normally being accessed by
many users or processes at the same time. - Both queries and modifications.
- Unlike operating systems, which support
interaction of processes, a DMBS needs to keep
processes from troublesome interactions.
3Example Bad Interaction
- You and your domestic partner each take 100 from
different ATMs at about the same time. - The DBMS better make sure one account deduction
doesnt get lost. - Compare An OS allows two people to edit a
document at the same time. If both write, ones
changes get lost.
4ACID Transactions
- A DBMS is expected to support ACID
transactions, processes that are - Atomic Either the whole process is done or none
is. - Consistent Database constraints are preserved.
- Isolated It appears to the user as if only one
process executes at a time. - Durable Effects of a process do not get lost if
the system crashes.
5Transactions in SQL
- SQL supports transactions, often behind the
scenes. - Each statement issued at the generic query
interface is a transaction by itself. - In programming interfaces like Embedded SQL or
PSM, a transaction begins the first time a SQL
statement is executed and ends with the program
or an explicit transaction-end.
6COMMIT
- The SQL statement COMMIT causes a transaction to
complete. - Its database modifications are now permanent in
the database.
7ROLLBACK
- The SQL statement ROLLBACK also causes the
transaction to end, but by aborting. - No effects on the database.
- Failures like division by 0 or a constraint
violation can also cause rollback, even if the
programmer does not request it.
8An Example Interacting Processes
- Assume the usual Sells(bar,beer,price) relation,
and suppose that Joes Bar sells only Bud for
2.50 and Miller for 3.00. - Sally is querying Sells for the highest and
lowest price Joe charges. - Joe decides to stop selling Bud and Miller, but
to sell only Heineken at 3.50.
9Sallys Program
- Sally executes the following two SQL statements,
which we call (min) and (max), to help remember
what they do. - (max) SELECT MAX(price) FROM Sells
- WHERE bar Joes Bar
- (min) SELECT MIN(price) FROM Sells
- WHERE bar Joes Bar
10Joes Program
- At about the same time, Joe executes the
following steps, which have the mnemonic names
(del) and (ins). - (del) DELETE FROM Sells
- WHERE bar Joes Bar
- (ins) INSERT INTO Sells
- VALUES(Joes Bar, Heineken, 3.50)
11Interleaving of Statements
- Although (max) must come before (min), and (del)
must come before (ins), there are no other
constraints on the order of these statements,
unless we group Sallys and/or Joes statements
into transactions.
12Example Strange Interleaving
- Suppose the steps execute in the order
(max)(del)(ins)(min). - Joes Prices
- Statement
- Result
- Sally sees MAX lt MIN!
(ins)
13Fixing the Problem by Using Transactions
- If we group Sallys statements (max)(min) into
one transaction, then she cannot see this
inconsistency. - She sees Joes prices at some fixed time.
- Either before or after he changes prices, or in
the middle, but the MAX and MIN are computed from
the same prices.
14Another Problem Rollback
- Suppose Joe executes (del)(ins), not as a
transaction, but after executing these
statements, thinks better of it and issues a
ROLLBACK statement. - If Sally executes her statements after (ins) but
before the rollback, she sees a value, 3.50, that
never existed in the database.
15Solution
- If Joe executes (del)(ins) as a transaction, its
effect cannot be seen by others until the
transaction executes COMMIT. - If the transaction executes ROLLBACK instead,
then its effects can never be seen.
16Isolation Levels
- SQL defines four isolation levels choices
about what interactions are allowed by
transactions that execute at about the same time. - How a DBMS implements these isolation levels is
highly complex, and a typical DBMS provides its
own options.
17Choosing the Isolation Level
- Within a transaction, we can say
- SET TRANSACTION ISOLATION LEVEL X
- where X
- SERIALIZABLE
- REPEATABLE READ
- READ COMMITTED
- READ UNCOMMITTED
18Serializable Transactions
- If Sally (max)(min) and Joe (del)(ins) are
each transactions, and Sally runs with isolation
level SERIALIZABLE, then she will see the
database either before or after Joe runs, but not
in the middle. - Its up to the DBMS vendor to figure out how to
do that, e.g. - True isolation in time.
- Keep Joes old prices around to answer Sallys
queries.
19Isolation Level Is Personal Choice
- Your choice, e.g., run serializable, affects only
how you see the database, not how others see it. - Example If Joe Runs serializable, but Sally
doesnt, then Sally might see no prices for Joes
Bar. - i.e., it looks to Sally as if she ran in the
middle of Joes transaction.
20Read-Commited Transactions
- If Sally runs with isolation level READ
COMMITTED, then she can see only committed data,
but not necessarily the same data each time. - Example Under READ COMMITTED, the interleaving
(max)(del)(ins)(min) is allowed, as long as Joe
commits. - Sally sees MAX lt MIN.
21Repeatable-Read Transactions
- Requirement is like read-committed, plus if data
is read again, then everything seen the first
time will be seen the second time. - But the second and subsequent reads may see more
tuples as well.
22Example Repeatable Read
- Suppose Sally runs under REPEATABLE READ, and the
order of execution is (max)(del)(ins)(min). - (max) sees prices 2.50 and 3.00.
- (min) can see 3.50, but must also see 2.50 and
3.00, because they were seen on the earlier read
by (max).
23Read Uncommitted
- A transaction running under READ UNCOMMITTED can
see data in the database, even if it was written
by a transaction that has not committed (and may
never). - Example If Sally runs under READ UNCOMMITTED,
she could see a price 3.50 even if Joe later
aborts.
24Concurrency Control
DB (consistency constraints)
25Review
- Why do we need transaction?
- Whats ACID?
- Whats SQL support for transaction?
- Whats the four isolation level
- SERIALIZABLE
- REPEATABLE READ
- READ COMMITTED
- READ UNCOMMITTED
26Example
- T1 Read(A) T2 Read(A)
- A ? A100 A ? A?2
- Write(A) Write(A)
- Read(B) Read(B)
- B ? B100 B ? B?2
- Write(B) Write(B)
- Constraint AB
27Schedule A
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(B) B ? B100
- Write(B)
- Read(A)A ? A?2
- Write(A)
- Read(B)B ? B?2
- Write(B)
-
28Schedule B
- T1 T2
- Read(A)A ? A?2
- Write(A)
- Read(B)B ? B?2
- Write(B)
- Read(A) A ? A100
- Write(A)
- Read(B) B ? B100
- Write(B)
-
29Schedule C
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(A)A ? A?2
- Write(A)
- Read(B) B ? B100
- Write(B)
- Read(B)B ? B?2
- Write(B)
-
30Schedule D
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(A)A ? A?2
- Write(A)
- Read(B)B ? B?2
- Write(B)
- Read(B) B ? B100
- Write(B)
-
31Schedule E
Same as Schedule D but with new T2
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(A)A ? A?1
- Write(A)
- Read(B)B ? B?1
- Write(B)
- Read(B) B ? B100
- Write(B)
-
32- Want schedules that are good, regardless of
- initial state and
- transaction semantics
- Only look at order of read and writes
- Example
- Scr1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B)
33Example Scr1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2
(B)
- Scr1(A)w1(A) r1(B)w1(B)r2(A)w2(A)r2(B)w2(B)
- T1 T2
34Review
- Why do we need transaction?
- Whats ACID?
- Whats SQL support for transaction?
- Whats the four isolation level
- SERIALIZABLE
- REPEATABLE READ
- READ COMMITTED
- READ UNCOMMITTED
35Example
- T1 Read(A) T2 Read(A)
- A ? A100 A ? A?2
- Write(A) Write(A)
- Read(B) Read(B)
- B ? B100 B ? B?2
- Write(B) Write(B)
- Constraint AB
36Schedule D
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(A)A ? A?2
- Write(A)
- Read(B)B ? B?2
- Write(B)
- Read(B) B ? B100
- Write(B)
-
37- However, for Sd
- Sdr1(A)w1(A)r2(A)w2(A) r2(B)w2(B)r1(B)w1(B)
- as a matter of fact,
- T2 must precede T1
- in any equivalent schedule,
- i.e., T2 ? T1
38- T1 T2 Sd cannot be rearranged
- into a serial schedule
- Sd is not equivalent to
- any serial schedule
- Sd is bad
39Schedule C
- T1 T2
- Read(A) A ? A100
- Write(A)
- Read(A)A ? A?2
- Write(A)
- Read(B) B ? B100
- Write(B)
- Read(B)B ? B?2
- Write(B)
-
40Returning to Sc
- Scr1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B)
- T1 ? T2 T1 ? T2
? no cycles ? Sc is equivalent to a serial
schedule (in this case T1,T2)
41Concepts
- Transaction sequence of ri(x), wi(x) actions
- Conflicting actions r1(A) w2(A) w1(A)
- w2(A) r1(A) w2(A)
- Schedule represents chronological order in
which actions are executed - Serial schedule no interleaving of actions
or transactions
42Definition
- S1, S2 are conflict equivalent schedules
- if S1 can be transformed into S2 by a series of
swaps on non-conflicting actions.
43Definition
- A schedule is conflict serializable if it is
conflict equivalent to some serial schedule.
44Precedence graph P(S) (S is schedule)
- Nodes transactions in S
- Arcs Ti ? Tj whenever
- - pi(A), qj(A) are actions in S
- - pi(A) ltS qj(A)
- - at least one of pi, qj is a write
45Exercise
- What is P(S) forS w3(A) w2(C) r1(A) w1(B)
r1(C) w2(A) r4(A) w4(D) - Is S serializable?
46Another Exercise
- What is P(S) forS w1(A) r2(A) r3(A) w4(A) ?
47Lemma
- S1, S2 conflict equivalent ? P(S1)P(S2)
48- Note P(S1)P(S2) ? S1, S2 conflict equivalent
Counter example S1w1(A) r2(A) w2(B) r1(B)
S2r2(A) w1(A) r1(B) w2(B)
49Theorem
- P(S1) acyclic ?? S1 conflict serializable
(?) Assume S1 is conflict serializable ? ? Ss
Ss, S1 conflict equivalent ? P(Ss) P(S1) ?
P(S1) acyclic since P(Ss) is acyclic
50Theorem
P(S1) acyclic ?? S1 conflict serializable
T1 T2 T3 T4
- (?) Assume P(S1) is acyclic
- Transform S1 as follows
- (1) Take T1 to be transaction with no incident
arcs - (2) Move all T1 actions to the front
- S1 . qj(A).p1(A)..
- (3) we now have S1 lt T1 actions gtlt... rest ...gt
- (4) repeat above steps to serialize rest!