Title: TimeStamp Ordering Concurrency Control Protocols In Distributed DataBase Systems
Table of Contents
- Introduction
- Background: DDBMS architecture, transaction processing, concurrency control.
- Classic TSO protocols: the 1981 state of the art.
- TSO performance: TSO vs. 2PL performance.
- TSO today: a new view on TSO performance; new uses of TSO in DDBMSs.
- Conclusions: topic summary.
- References: listed in order of appearance.
1. INTRODUCTION
- Is the TSO protocol a viable option for Concurrency Control (CC) in DDBSs?
- Quite a few of our CSC536 readings discussed (or at least mentioned) CC, but only 3 papers mentioned the timestamp ordering (TSO) approach to CC, and none of them presented a study of TSO CC algorithms:
  - Our 02/14 reading, Mobile Agent Model for Transaction Processing in DDBSs (Komiya et al., 2003), states that the traditional approaches use either a 2-phase locking protocol (2PL) or timestamp ordering (TSO). However, their proposed method uses a blocking approach.
  - Our 04/11 reading, RTDBSs and Data Services (Ramamritham, Son, DiPippo, 2004), states that for conflict resolution in RTDBs, various time-cognizant extensions of two-phase locking (2PL), optimistic, and timestamp-based protocols have been proposed.
  - Our 04/18 reading, A Study of CC in RT, Active DBSs (Datta & Son, 2002), mentions that OCC-TI (a timestamp variant of the optimistic CC protocol, based on dynamic adjustment of serialization order) is shown to perform very well. They compare it in simulation experiments to 2PL-HP (2PL High Priority), WAIT-50, and their proposed OCC-APFO (Adaptive Priority Fan Out Optimistic protocol).
2. BACKGROUND
2.1 Distributed DataBase Systems Architecture
- A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network.
- A distributed database management system (DDBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.
- Distributed database system (DDBS) = DDB + DDBMS.
[Figure: Difference between a centralized DBMS and a distributed DBMS environment. Panels: Centralized DBMS on a Network; Distributed DBMS Environment.]
2.2 Transaction Processing in DDBSs
- A transaction is a collection of actions that transform the system state while preserving system consistency.
- Transaction processing has to ensure:
  - Concurrency transparency
  - Failure transparency
2.3 Concurrency Control (CC)
- CC: coordinating concurrent accesses to data while preserving concurrency transparency.
- Main difficulty: preventing DB updates (writes) by one user from interfering with DB retrievals (reads) or updates (writes) performed by another.
- Without CC in place: problems like lost updates and inconsistent retrievals.
- CC is harder in DDBSs than in a centralized DBS:
  - Users may access data stored at many different nodes.
  - A CC mechanism running at one node cannot instantly know the interactions at other nodes.
- In centralized DBMSs, CC is well understood.
  - Studies started in the early 1970s. By the end of the 1970s, the Two-Phase Locking (2PL) approach (based on critical sections) had been accepted as the standard solution.
- In DDBSs, the choice of CC was still debated in the early 1980s, and a large number of algorithms were proposed.
- Bernstein & Goodman (B&G, 1981) surveyed the state of the art in DDBMS CC, presenting 48 CC methods. They:
  - Focus on the structure and correctness of the algorithms.
  - Place little emphasis on performance issues.
  - Introduce standard terminology for DDBMS CC algorithms and a standard model for the DDBMS.
2.3.1 DDBMS Model
- Each site in a DDBMS is a computer running:
  - A transaction manager (TM), and
  - A data manager (DM).
- Correctness criteria of a CC algorithm. The users expect that:
  - Each transaction submitted will eventually be executed.
  - The computation performed by each transaction will be the same whether it executes alone (in a dedicated system) or in parallel with other transactions in a multi-programmed system.
- A DDBMS has four components: transactions, TMs, DMs, and data.
- Four operations are defined in the interface between a transaction (T) and the TM:
  - BEGIN: The TM creates a private workspace for T.
  - READ(X): Data item X is read.
  - WRITE(X, new-value): X in T's private workspace is updated to new-value.
  - END: Two-phase commit (2PC) takes place, and T's execution is finished.
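The four interface operations can be illustrated with a minimal Python sketch. The class name `TransactionManager`, the dict `store` standing in for the DMs, and the choice to cache reads in the workspace are assumptions made for illustration only. END simply copies the written items to the store; a real TM would run 2PC at that point.

```python
class TransactionManager:
    """Toy TM: `store` is a plain dict standing in for the DMs (assumption)."""

    def __init__(self, store):
        self.store = store
        self.workspaces = {}  # transaction id -> private copy of items
        self.dirty = {}       # transaction id -> items written by the transaction

    def begin(self, t):
        # BEGIN: create a private workspace for T.
        self.workspaces[t] = {}
        self.dirty[t] = set()

    def read(self, t, x):
        # READ(X): use the private copy if present, else fetch from the store.
        ws = self.workspaces[t]
        if x not in ws:
            ws[x] = self.store.get(x)
        return ws[x]

    def write(self, t, x, new_value):
        # WRITE(X, new-value): update only T's private workspace.
        self.workspaces[t][x] = new_value
        self.dirty[t].add(x)

    def end(self, t):
        # END: publish T's writes. 2PC is elided in this sketch.
        ws = self.workspaces.pop(t)
        for x in self.dirty.pop(t):
            self.store[x] = ws[x]
```

Until `end` is called, other transactions reading the store do not see T's writes, which is the point of the private workspace.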
2.3.2 CC Problem Decomposition
- Serializability:
  - An execution is serializable if it is computationally equivalent to a non-concurrent, serial execution.
- B&G decompose the CC problem into 2 sub-problems:
  - Read-write synchronization (rw), and
  - Write-write synchronization (ww).
- Only 2 types of operations access the stored data: dm-read and dm-write.
- Two operations conflict if they operate on the same data item and at least one is a dm-write.
  - The conflicts are either rw or ww.
- B&G determined that all the algorithms they examined were variations of only 2 basic techniques: 2-phase locking (2PL) and timestamp ordering (TSO).
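The conflict definition above is small enough to write down directly. The sketch below is illustrative only; the `Op` record and the function name `conflict_type` are hypothetical names, not taken from the survey.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str  # "dm-read" or "dm-write"
    item: str  # name of the data item the operation accesses


def conflict_type(a: Op, b: Op) -> Optional[str]:
    """Return "rw", "ww", or None when the two operations do not conflict."""
    if a.item != b.item:
        return None              # different data items never conflict
    kinds = {a.kind, b.kind}
    if kinds == {"dm-write"}:
        return "ww"              # both operations are dm-writes
    if "dm-write" in kinds:
        return "rw"              # one dm-read and one dm-write
    return None                  # two dm-reads never conflict
```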
3. CLASSIC TSO PROTOCOLS
- In Timestamp Ordering (TSO):
  - The serialization order is selected a priori.
  - Transaction execution is forced to obey this order.
- Each transaction is assigned a unique timestamp (TS) by its TM.
- The TM attaches the TS to all dm-reads and dm-writes issued on behalf of the transaction.
- DMs process conflicting operations in TS order.
- The timestamp of operation O is denoted ts(O).
- For rw synchronization, 2 operations conflict if:
  - Both operate on the same data item, and
  - One is a dm-read and the other is a dm-write.
- For ww synchronization, 2 operations conflict if:
  - Both operate on the same data item, and
  - Both are dm-writes.
3.1 Basic TSO Implementation
- An implementation of TSO needs a TSO scheduler (S): a software module that receives dm-reads and dm-writes and outputs these operations according to the TSO rules.
- In DDBMSs, for the 2-phase commit (2PC) to work properly, prewrites must also be processed through the TSO scheduler.
- The basic TSO implementation distributes the schedulers along with the database.
- In centralized DBMSs, 2PC can be ignored, and thus the basic TSO scheduler is very simple:
  - At each DM, and for each data item x stored at the DM, the scheduler records the largest timestamp of any dm-read(x) or dm-write(x) that has been processed, denoted r-ts(x) and w-ts(x) respectively.
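A minimal sketch of this simple case, with 2PC ignored, might look as follows. It assumes the usual basic T/O rules: a dm-read is rejected if it is older than the latest processed dm-write, and a dm-write is rejected if it is older than the latest processed dm-read (rw) or dm-write (ww). The class name `BasicTSO` and the use of 0 as the initial timestamp are illustrative assumptions.

```python
class BasicTSO:
    """Basic T/O scheduler for one DM, without 2PC (no prewrites, no buffering)."""

    def __init__(self):
        self.r_ts = {}   # item -> largest timestamp of any processed dm-read
        self.w_ts = {}   # item -> largest timestamp of any processed dm-write
        self.value = {}  # item -> current stored value

    def dm_read(self, ts, x):
        # A read older than the latest processed write arrives too late.
        if ts < self.w_ts.get(x, 0):
            return ("rejected", None)
        self.r_ts[x] = max(self.r_ts.get(x, 0), ts)
        return ("output", self.value.get(x))

    def dm_write(self, ts, x, new_value):
        # rw check against r-ts(x), ww check against w-ts(x).
        if ts < self.r_ts.get(x, 0) or ts < self.w_ts.get(x, 0):
            return "rejected"
        self.w_ts[x] = ts
        self.value[x] = new_value
        return "output"
```

A rejected operation would cause its transaction to be restarted with a new timestamp; that part is outside this sketch.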
- In distributed DBMSs, 2PC is incorporated by timestamping prewrites and accepting or rejecting prewrites instead of dm-writes.
- Once the scheduler (S) accepts a prewrite, it must guarantee to accept its corresponding dm-write.
- For rw (or ww) synchronization:
  - Once S accepts a prewrite(x) with stamp TS, and until its corresponding dm-write(x) is output, S must not output any dm-read(x) (or dm-write(x)) with a timestamp newer than TS.
  - The effect is similar to setting a write-lock on data item x for the duration of two-phase commit.
- To implement the above rules, S buffers dm-reads, dm-writes, and prewrites.
3.1 Basic TSO Implementation: The Algorithm
- Let min-r-ts(x) be the minimum TS of the buffered dm-reads, and let min-w-ts(x) and min-p-ts(x) be defined analogously for buffered dm-writes and prewrites.
- RW synchronization:
  - Let R be a dm-read(x).
    - If ts(R) < w-ts(x), R is rejected.
    - Else if ts(R) > min-p-ts(x), R is buffered.
    - Else R is output.
  - Let P be a prewrite(x).
    - If ts(P) < r-ts(x), P is rejected.
    - Else P is buffered.
  - Let W be a dm-write(x). W is never rejected.
    - If ts(W) > min-r-ts(x), W is buffered.
    - Else W is output, and the corresponding prewrite is de-buffered.
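The three rw rules translate almost line for line into code. The sketch below covers a single data item x and only the accept/buffer/reject decision made when an operation arrives; retesting buffered operations (buffer emptying) is deliberately left out. The class name `RWScheduler`, the initial timestamps of 0, and treating an empty buffer as having minimum timestamp infinity are assumptions of the sketch.

```python
import math


class RWScheduler:
    """rw synchronization decisions for one data item x (no buffer emptying)."""

    def __init__(self):
        self.r_ts = 0        # largest timestamp of any dm-read output so far
        self.w_ts = 0        # largest timestamp of any dm-write output so far
        self.reads = []      # timestamps of buffered dm-reads
        self.prewrites = []  # timestamps of buffered prewrites
        self.writes = []     # timestamps of buffered dm-writes

    @staticmethod
    def _min(buffer):
        # An empty buffer imposes no constraint.
        return min(buffer) if buffer else math.inf

    def dm_read(self, ts):
        if ts < self.w_ts:
            return "rejected"
        if ts > self._min(self.prewrites):
            self.reads.append(ts)
            return "buffered"
        self.r_ts = max(self.r_ts, ts)
        return "output"

    def prewrite(self, ts):
        if ts < self.r_ts:
            return "rejected"
        self.prewrites.append(ts)
        return "buffered"

    def dm_write(self, ts):
        # A dm-write is never rejected, only delayed.
        if ts > self._min(self.reads):
            self.writes.append(ts)
            return "buffered"
        self.w_ts = max(self.w_ts, ts)
        if ts in self.prewrites:
            self.prewrites.remove(ts)  # de-buffer the corresponding prewrite
        return "output"
```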
[Figure: Buffer emptying for basic T/O rw synchronization.]
Author's note
- Typo in the previous pseudo-code figure?
- The pseudocode given for when an R-operation is ready is as follows:
  - R is ready if it precedes the earliest prewrite request: if ts(R) < min-p-ts(x)
- I found that the following modification is needed:
  - R is ready if it is <= the earliest prewrite request: if ts(R) <= min-p-ts(x)
- Test case (operations on data item 0, in arrival order):
    Time 4  Op P  DI 0  DIVal 4
    Time 6  Op P  DI 0  DIVal 6
    Time 5  Op R  DI 0
    Time 5  Op P  DI 0  DIVal 5
    Time 5  Op W  DI 0  DIVal 5
    Time 4  Op W  DI 0  DIVal 4
    Time 6  Op W  DI 0  DIVal 6
- Output with the original condition:
    Time 4  Op W  DI 0  DIVal 4  Status SUCCEEDED  Conflict No_Conflict
- As I expected, the output shows only transaction 4 being executed, even though logically I would expect transactions 5 and 6 to succeed as well.
- Therefore, I made the following correction to the source code:
  - R is ready if ts(R) <= min-p-ts(x)
- New output:
    Time 4  Op W  DIVal 4  SUCCESS  No_Conflict
    Time 5  Op R  DIVal 4  SUCCESS  RW_Conflict
    Time 5  Op W  DIVal 5  SUCCESS  RW_Conflict
    Time 6  Op W  DIVal 6  SUCCESS  No_Conflict
- WW synchronization:
  - Let P be a prewrite(x).
    - If ts(P) < w-ts(x), P is rejected.
    - Else P is buffered.
  - Let W be a dm-write(x). W is never rejected.
    - If ts(W) > min-p-ts(x), W is buffered.
    - Else W is output.
  - When W is output, the corresponding prewrite is de-buffered. If min-p-ts(x) increased, buffered dm-writes are retested to see if any can now be output.
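The ww rules, including the retest step, can be sketched for a single data item as follows. The class name `WWScheduler`, the `log` list recording output order, and the initial w-ts of 0 are assumptions for illustration. Timestamps are assumed unique, and each dm-write is assumed to have had its prewrite accepted earlier.

```python
import math


class WWScheduler:
    """ww synchronization with prewrites for one data item x."""

    def __init__(self):
        self.w_ts = 0        # largest timestamp of any dm-write output so far
        self.prewrites = []  # timestamps of buffered prewrites
        self.writes = []     # timestamps of buffered dm-writes
        self.log = []        # timestamps of dm-writes, in output order

    def _min_p_ts(self):
        return min(self.prewrites) if self.prewrites else math.inf

    def prewrite(self, ts):
        if ts < self.w_ts:
            return "rejected"
        self.prewrites.append(ts)
        return "buffered"

    def dm_write(self, ts):
        # A dm-write is never rejected, only delayed behind older prewrites.
        if ts > self._min_p_ts():
            self.writes.append(ts)
            return "buffered"
        self._output(ts)
        return "output"

    def _output(self, ts):
        self.w_ts = max(self.w_ts, ts)
        self.log.append(ts)
        if ts in self.prewrites:
            self.prewrites.remove(ts)  # de-buffer the corresponding prewrite
        # min-p-ts(x) may have increased: retest the buffered dm-writes.
        ready = sorted(t for t in self.writes if t <= self._min_p_ts())
        if ready:
            self.writes.remove(ready[0])
            self._output(ready[0])
```

With prewrites 4, 5, 6 accepted and dm-writes arriving in the order 6, 5, 4, the first two are buffered and all three are output in timestamp order once dm-write 4 arrives.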
3.2 The Thomas Write Rule (TWR)
- Basic TSO can be optimized for ww synchronization using TWR:
  - Let W be a dm-write(x), and suppose ts(W) < w-ts(x).
  - Instead of rejecting W, it can simply be ignored.
- TWR applies to dm-writes that try to place obsolete data into the DB.
- TWR guarantees that applying a set of dm-writes to x has the same effect as if the dm-writes were applied in TS order.
- If TWR is used, there is no need to incorporate 2PC into the ww synchronization algorithm: the ww scheduler always accepts prewrites and never buffers dm-writes.
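The guarantee in the third bullet is easy to check on a small example. The sketch below applies a sequence of dm-writes to one data item under TWR; the function name `apply_writes_twr` and the (timestamp, value) tuple representation are illustrative assumptions, and timestamps are assumed unique and positive.

```python
def apply_writes_twr(writes):
    """Apply (timestamp, value) dm-writes to one item in arrival order under TWR.

    Returns the final stored value.
    """
    w_ts, value = 0, None
    for ts, new_value in writes:
        if ts < w_ts:
            continue  # obsolete write: ignored rather than rejected
        w_ts, value = ts, new_value
    return value
```

For example, the arrival orders `[(1, "a"), (2, "b"), (3, "c")]` and `[(3, "c"), (1, "a"), (2, "b")]` both leave `"c"` stored, which is what applying the writes in TS order would produce.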
3.3 Multiversion TSO
- Basic TSO can be improved for rw synchronization using multiversion data items.
- For each data item x there is a set of r-ts's and a set of (w-ts, value) pairs, called versions.
- The r-ts's of x record the TSs of all executed dm-read(x) operations, and the versions record the TSs and values of all executed dm-write(x) operations.
- In practice one cannot store r-ts's and versions forever: techniques for deleting old versions and timestamps are needed.
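The slide lists the data kept per item but not the scheduling rules. The sketch below assumes the usual multiversion T/O rules: a dm-read is never rejected and reads the version with the largest w-ts below its own timestamp; a dm-write is rejected if some recorded r-ts lies between its timestamp and the next larger w-ts, because that read should have seen the new version. The class name `MultiversionItem`, the initial version at timestamp 0, and the treatment of the interval's upper end as inclusive are assumptions. Timestamps are assumed unique and positive.

```python
import bisect
import math


class MultiversionItem:
    """Versions and read timestamps for a single data item x."""

    def __init__(self, initial=None):
        self.w_ts = [0]          # sorted write timestamps, one per version
        self.values = [initial]  # values[i] was written at w_ts[i]
        self.r_ts = []           # sorted timestamps of executed dm-reads

    def dm_read(self, ts):
        # Never rejected: read the version with the largest w-ts below ts.
        i = bisect.bisect_left(self.w_ts, ts) - 1
        bisect.insort(self.r_ts, ts)
        return self.values[i]

    def dm_write(self, ts, value):
        # Interval from ts to the smallest w-ts after ts.
        j = bisect.bisect_right(self.w_ts, ts)
        upper = self.w_ts[j] if j < len(self.w_ts) else math.inf
        # Reject if an executed read falls in that interval.
        k = bisect.bisect_right(self.r_ts, ts)
        if k < len(self.r_ts) and self.r_ts[k] <= upper:
            return "rejected"
        self.w_ts.insert(j, ts)
        self.values.insert(j, value)
        return "output"
```

Note that an old read (for example with timestamp 5 arriving after a write with timestamp 10) still succeeds, which is the improvement over basic TSO. No version deletion is shown here.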
3.4 Conservative TSO
- Eliminates restarts during TSO scheduling.
- Requires that each scheduler receive dm-reads (or dm-writes) from each TM in TS order.
- Since the network is assumed to be FIFO, this ordering is accomplished by requiring TMs to send dm-reads (or dm-writes) to schedulers in TS order.
- The scheduler buffers dm-reads and dm-writes as part of its normal operation. When a scheduler buffers an operation, it remembers the TM that sent it.
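One simplified way to picture conservative scheduling is a scheduler that keeps one FIFO queue per TM and outputs the operation with the smallest timestamp only while every TM has at least one buffered operation. Because each TM sends in TS order, no operation with a smaller timestamp can arrive later, so nothing is ever rejected. This sketch is an assumption-laden simplification: it does not distinguish dm-reads from dm-writes, and it ignores the problem of a TM that has nothing to send. The names `ConservativeScheduler` and `receive` are hypothetical.

```python
from collections import deque


class ConservativeScheduler:
    """Simplified conservative T/O: one FIFO buffer per TM."""

    def __init__(self, tm_ids):
        self.queues = {tm: deque() for tm in tm_ids}

    def receive(self, tm, ts, op):
        """Buffer (ts, op) from `tm`; return the operations that can now be output."""
        queue = self.queues[tm]
        if queue and queue[-1][0] >= ts:
            raise ValueError("each TM must send operations in timestamp order")
        queue.append((ts, op))
        return self._drain()

    def _drain(self):
        output = []
        # Safe only while every TM has a buffered operation: the smallest
        # head timestamp cannot be preceded by any future arrival.
        while all(self.queues.values()):
            tm = min(self.queues, key=lambda t: self.queues[t][0][0])
            output.append(self.queues[tm].popleft())
        return output
```

The price of never restarting is delay: an operation waits until every other TM has sent something with a larger timestamp.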
3.5 Timestamp Management
- Common critique of TSO schedulers:
  - Too much memory is needed to store timestamps.
  - This can be overcome by "forgetting" old timestamps.
- Timestamps (TSs) are used in basic TSO to reject operations that "arrive late"; for example, a dm-read(x) with stamp TS1 is rejected if it arrives after a dm-write(x) with stamp TS2, where TS1 < TS2.
- In principle, TS1 and TS2 can differ by an arbitrary amount.
- In practice, it is unlikely that TSs will differ by more than a few minutes.
- Consequently, TSs can be stored in small tables that are periodically purged.
- R-ts's are stored in R-table entries of the form (x, R-ts(x)); for any data item x, there is at most one entry.
- A variable R-min tells the maximum value of any TS that has been purged from the table.
- To update R-ts(x), S modifies the (x, R-ts(x)) entry in the table, if one exists. Otherwise, a new entry is created.
- When the R-table is full, S selects an appropriate value for R-min and deletes all entries with a smaller timestamp from the table.
- The W-ts's are managed similarly. Analogous techniques can be devised for multiversion TSO databases.
- Maintaining TSs for conservative TSO is even cheaper, since it requires only timestamped operations, not also timestamped data.
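The R-table can be sketched as a bounded dictionary plus the R-min variable. Two points in the sketch are assumptions rather than something stated above: a lookup for an item with no entry returns R-min (a safe over-estimate, since the purged value was at most R-min), and the purge policy picks the lower median of the stored timestamps as the new R-min and deletes entries up to and including it. The class name `RTable` and its methods are hypothetical.

```python
class RTable:
    """Bounded table of (x, R-ts(x)) entries with purging."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # x -> R-ts(x)
        self.r_min = 0     # upper bound on every purged timestamp

    def lookup(self, x):
        # With no entry, R-ts(x) <= R-min, so R-min is a safe answer.
        return self.entries.get(x, self.r_min)

    def update(self, x, ts):
        if x not in self.entries and len(self.entries) >= self.capacity:
            self._purge()
        self.entries[x] = max(self.lookup(x), ts)

    def _purge(self):
        # Assumed policy: new R-min is the lower median of stored timestamps.
        stamps = sorted(self.entries.values())
        self.r_min = max(self.r_min, stamps[(len(stamps) - 1) // 2])
        self.entries = {x: ts for x, ts in self.entries.items()
                        if ts > self.r_min}
```

Over-estimating R-ts(x) can only cause extra rejections, never an incorrect acceptance, which is why forgetting old timestamps is safe.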
3.6 Integrated CC Methods
- An integrated CC method consists of:
  - two components (an rw and a ww synchronization technique), and
  - an interface between the components to ensure serializability.
- Bernstein & Goodman list 48 CC methods that can be constructed by combining synchronization techniques based on 2PL and/or TSO:
  - Pure 2PL methods,
  - Pure TSO methods, or
  - Methods that combine 2PL and TSO techniques.
- This presentation discusses only the pure TSO techniques.
Pure TSO methods:
Method  RW technique       WW technique
1       Basic T/O          Basic T/O
2       Basic T/O          Thomas Write Rule (TWR)
3       Basic T/O          Multiversion T/O
4       Basic T/O          Conservative T/O
5       Multiversion T/O   Basic T/O
6       Multiversion T/O   TWR
7       Multiversion T/O   Multiversion T/O
8       Multiversion T/O   Conservative T/O
9       Conservative T/O   Basic T/O
10      Conservative T/O   TWR
11      Conservative T/O   Multiversion T/O
12      Conservative T/O   Conservative T/O
4. TSO PERFORMANCE
- Main performance metrics for CC algorithms:
  - system throughput
  - transaction response time
- 4 cost factors influence these metrics:
  - inter-site communication
  - local processing
  - transaction restarts
  - transaction blocking
- The impact of each cost factor varies depending on algorithm, system, and application type.
- B&G state, at the time they wrote their paper in 1981, that the "impact detail is not understood yet, and a comprehensive quantitative analysis is beyond the state of the art".
- Since 1981, more research has been done on distributed CC performance.
- Carey & Livny (1988) studied the performance of 4 algorithms: Distributed 2PL, Wound-Wait (WW), Basic TSO (BTO), and a Distributed Optimistic algorithm (OPT). They:
  - Examined the algorithms for various degrees of contention, distributedness of the workload, and data replication.
  - Found that 2PL and OPT dominated BTO and WW.
  - Concluded that optimistic locking (where transactions lock remote copies of data only as they enter the commit protocol, at the risk of end-of-transaction deadlocks) is the best performer in replicated DBs where messages are costly.
5. TSO TODAY
- Until recently, TSO was viewed as the black sheep of the family of CC algorithms. This seems to be changing.
- Nørvåg et al. (1997) present a simulation study that examines the performance and response times of two CC algorithms, TSO and 2PL.
- Their results show the following:
  - For a mix of short and long transactions, the throughput is significantly higher for TSO than for the 2PL scheduler.
  - For short transactions, the performance is almost identical.
  - For long transactions, 2PL performs better.
- The authors comment:
  - TSO throughput was not expected to be higher than 2PL! (Most previous studies used centralized DBs.)
  - In tests done using the centralized version of the simulator, 2PL did indeed perform better than TSO in most cases.
- Srinivasa et al. (2001) take a new look at TSO CC.
- They state that during the 1980s and 1990s, the popular conception was that TSO techniques like Basic TSO (BTO) perform poorly compared to dynamic 2PL (D-2PL):
  - That is because previous studies concentrated on centralized DBs and low data contention scenarios.
  - The poor performance is mainly due to BTO's high reject rate (and thus high restart rate), which causes it to reach hardware resource limits.
- But today's processors are faster, and workload characteristics have changed.
- The authors show that BTO's performance is much better than D-2PL's for a wide range of conditions, especially for high data contention.
- D-2PL outperforms BTO only when both data contention and message latency are low.
- Increasing throughput demands make high data contention important, and BTO becomes an attractive choice for concurrency control.
5.1 New TSO Uses
- Pacitti et al. (1999, 2001) proposed refreshment algorithms (using TSO) that address the central problem of maintaining replica consistency in a lazy master replicated system.
- In their related work section they also mention other relatively recent work whose authors propose 2 new lazy update protocols, also based on timestamp ordering.
- Jensen & Lomet (2001) provide a new approach to transaction timestamping, both for the time at which the timestamp (TS) is chosen and for the CC protocol:
  - Their CC method combines TSO with 2PL.
  - They delay the choice of a transaction's TS until the moment when the TS is needed by a statement in the transaction, or until the transaction commits.
  - They chose to delay choosing the TS because the classical approach of forcing the choice of the TS at transaction start increases the chances that TS consistency checking will fail, thus resulting in aborts.
6. CONCLUSION
- This presentation discussed Timestamp Ordering (TSO) Concurrency Control (CC), from its beginnings (1970s) to the present time (2000s).
- The classic TSO methods were described thoroughly by Bernstein & Goodman in a 1981 survey of the state of the art in DDBMS CC.
- B&G design a system model and define terminology and concepts for a variety of CC algorithms. They define the concept of decomposing CC algorithms into read-write and write-write synchronization sub-algorithms. They do not focus on performance.
- The main CC performance metrics are:
  - System throughput
  - Transaction response time
- As was foreseen in the Bernstein & Goodman 1981 survey, many studies of the performance of CC algorithms have been performed since then (e.g., the 1988, 1997, and 2001 studies previously mentioned in this presentation).
- Even if in the 1970s and 1980s TSO was viewed as the black sheep of the family of CC algorithms from the point of view of performance, this view is changing today, due to recent conclusions that TSO actually performs better in many situations in today's DDBMSs.
7. REFERENCES
- [1] T. Komiya, H. Ohshida, M. Takizawa. Mobile Agent Model for Transaction Processing in Distributed Database Systems. Information Sciences, v. 154, issue 1-2, Aug 2003.
- [2] R. Ramakrishnan, J. Gehrke. Database Management Systems. McGraw Hill, 2003. Chapter 22 (Parallel and Distributed Databases).
- [3] A. Tanenbaum, M. van Steen. Distributed Systems: Principles and Paradigms. Prentice-Hall, 2002. Chapters 1, 2 and 5.
- [4] Philip A. Bernstein, Nathan Goodman. Concurrency Control in Distributed Database Systems. ACM Computing Surveys (CSUR), Volume 13, Issue 2, June 1981.
- [5] Michael J. Carey, Miron Livny. Distributed Concurrency Control Performance: A Study of Algorithms, Distribution, and Replication. Fourteenth International Conference on Very Large Data Bases (VLDB), August 29 - September 1, 1988, Los Angeles, California, USA, Proceedings.
- [6] Kjetil Nørvåg, Olav Sandstå, and Kjell Bratbergsengen. Concurrency Control in Distributed Object-Oriented Database Systems. Advances in Databases and Information Systems, 1997.
- [7] Rashmi Srinivasa, Craig Williams, Paul F. Reynolds Jr. A New Look at Timestamp Ordering Concurrency Control. Database and Expert Systems Applications, 12th International Conference, DEXA 2001, Munich, Germany, September 3-5, 2001, Proceedings.
- [8] C. Jensen and D. Lomet. Transaction timestamping in (temporal) databases. VLDB Conference, Rome, Italy, Sept. 2001. Available at ftp://ftp.research.microsoft.com/users/lomet/pub/temporaltime.pdf
- [9] E. Pacitti, P. Minet, and E. Simon. Fast algorithms for maintaining replica consistency in lazy master replicated databases. VLDB'99, pp. 126-137, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK. Also published in Distributed and Parallel Databases, 9, 237-267, 2001, by Kluwer Academic Publishers.
- [10] Stéphane Gançarski, Hubert Naacke, Esther Pacitti, Patrick Valduriez. Parallel Processing with Autonomous Databases in a Cluster System. Proc. CoopIS 2002, Irvine, California.
- [11] M. Tamer Özsu, Patrick Valduriez. Principles of Distributed Database Systems, Second Edition. Prentice Hall, ISBN 0-13-659707-6, 1999. Notes available at http://www.cs.ualberta.ca/database/ddbook.html
- [12] J. N. Gray. How High is High Performance Transaction Processing? Presentation at the High Performance Transaction Processing Workshop (HPTS99), Asilomar, California, 26-29 Sep 1999.
- [13] Y. Breitbart, R. Komondoor, R. Rastogi, and S. Seshadri. Update propagation protocols for replicated databases. ACM SIGMOD Int. Conference on Management of Data, Philadelphia, PA, May 1999.