TimeStamp Ordering Concurrency Control Protocols In Distributed DataBase Systems

1
TimeStamp Ordering Concurrency Control Protocols
In Distributed DataBase Systems
  • CSC536 Barton Price

2
Table of Contents
  • Introduction
  • Background: DDBMS architecture, transaction
    processing, concurrency control.
  • Classic TSO protocols: 1981 state of the art.
  • TSO performance: TSO vs. 2PL performance.
  • TSO today: new view on TSO performance; new uses
    of TSO in DDBMSs.
  • Conclusions: topic summary.
  • References: references in appearance order.

3
1. INTRODUCTION
  • Is the TSO protocol a viable option for
    Concurrency Control (CC) in DDBSs?

Quite a few of our CSC536 readings discussed (or
at least mentioned) CC, but only 3 papers
mentioned the TimeStamp Ordering (TSO) approach to
CC, and none of them presented a study of TSO CC
algorithms.
Our 02/14 reading - Mobile Agent Model for
Transaction Processing in DDBSs (Komiya 2003) -
states: "The traditional approaches are either
using a 2-phase locking protocol (2PL), or using
timestamp-ordering (TSO)". However, their
proposed method uses a blocking approach.
Our 04/11 reading - RTDBSs and Data Services
(Ramamritham, Son, DiPippo, 2004) - states: "For
conflict-resolution in RTDBs various
time-cognizant extensions of two-phase locking
(2PL), optimistic and timestamp-based protocols
have been proposed."
Our 04/18 reading - A Study of CC in RT, Active
DBSs (Datta & Son 2002) - mentions that OCC-TI
(a timestamp variant of the Optimistic CC
protocol, based on Dynamic Adjustment of
Serialization Order) is shown to perform very
well. They compare it in simulation experiments to
2PL-HP (2PL-High-Priority), WAIT-50, and their
proposed OCC-APFO (Adaptive Priority Fan Out
Optimistic protocol).
4
2. BACKGROUND
2.1 Distributed DataBase Systems Architecture
  • A distributed database (DDB) is a collection of
    multiple, logically interrelated databases
    distributed over a computer network.
  • A distributed database management system (DDBMS)
    is the software that manages the DDB and provides
    an access mechanism that makes this distribution
    transparent to the users.
  • Distributed database system (DDBS) = DDB +
    DDBMS.

5
BACKGROUND cont 2.1 Distributed DataBase
Systems Architecture
[Figure: Difference between a Centralized DBMS on a Network and a Distributed DBMS Environment]
6
BACKGROUND cont 2.2 Transaction Processing In
DDBSs
  • A transaction is a collection of actions that
    make transformations of system states while
    preserving system consistency.
  • Transaction processing has to ensure:
  • Concurrency transparency
  • Failure transparency

7
BACKGROUND cont 2.3 Concurrency Control (CC)
  • CC: coordinating concurrent accesses to data
    while preserving concurrency transparency.
  • Main difficulty: preventing DB updates (writes)
    by one user from interfering with DB retrievals
    (reads) or updates (writes) performed by another.
  • Without CC in place: problems such as lost
    updates and inconsistent retrievals.
  • CC in DDBSs is harder than in a centralized DBS:
  • Users may access data stored in many different
    nodes.
  • A CC mechanism running at one node cannot
    instantly know the interactions at other nodes.

8
BACKGROUND cont 2.3 Concurrency Control (CC)
cont
  • In centralized DBMSs, CC is well understood.
  • Studies started in the early 1970s. By the end
    of the 1970s, the Two-Phase Locking (2PL)
    approach (based on critical sections) had been
    accepted as the standard solution.
  • In DDBSs, the choice of CC was still debated in
    the early 1980s, and a large number of algorithms
    were proposed.
  • Bernstein & Goodman (1981) surveyed the state of
    the art in DDBMS CC, presenting 48 CC methods:
  • Focus on the structure and correctness of the
    algorithms.
  • Little emphasis on performance issues.
  • Introduce standard terminology for DDBMS CC
    algorithms and a standard model for the DDBMS.

9
BACKGROUND Concurrency Control (CC) cont 2.3.1
DDBMS model
  • Each site in a DDBMS is a computer running:
  • A transaction manager (TM), and
  • A data manager (DM)
  • Correctness criteria of a CC algorithm. The
    users expect that:
  • each transaction submitted will eventually be
    executed;
  • the computation performed by each transaction
    will be the same whether it executes alone (in a
    dedicated system), or in parallel with other
    transactions in a multi-programmed system.

10
BACKGROUND Concurrency Control (CC) cont 2.3.1
DDBMS model cont
  • A DDBMS has four components:
  • Transactions, TMs, DMs, and data.
  • There are 4 operations defined in the interface
    between the Transaction (T) and the TM:
  • BEGIN: The TM creates a private workspace for T.
  • READ(X): Data item X is read.
  • WRITE(X, new-value): X in T's private workspace
    is updated to new-value.
  • END: Two-phase commit (2PC) takes place, and T's
    execution is finished.

11
BACKGROUND Concurrency Control (CC) cont 2.3.1
CC Problem Decomposition
  • Serializability:
  • An execution is serializable if it is
    computationally equivalent to a non-concurrent,
    serial execution.
  • The CC problem is decomposed into 2 sub-problems
    (B&G):
  • Read-write synchronization (rw), and
  • Write-write synchronization (ww).
  • Only 2 types of operations access the stored
    data:
  • dm-read and dm-write.
  • Two operations conflict if:
  • They operate on the same data item and at least
    one is a dm-write.
  • The conflicts are either rw or ww.
  • B&G determined that all the algorithms that they
    examined were variations of only 2 basic
    techniques: 2-phase locking (2PL) and timestamp
    ordering (TSO).

12
3. CLASSIC TSO PROTOCOLS
  • In Timestamp Ordering (TSO):
  • The serialization order is selected a priori.
  • Transaction execution is forced to obey this
    order.
  • Each transaction is assigned a unique timestamp
    (Ts) by its TM.
  • The TM attaches the Ts to all dm-reads and
    dm-writes issued on behalf of the transaction.
  • DMs process conflicting operations in Ts order.
  • The timestamp of operation O is denoted ts(O).
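
The slides do not say how a TM makes its timestamps unique across sites. A minimal sketch, assuming the common scheme of pairing a local counter with a site identifier (the names Timestamp, TransactionManager and site_id are hypothetical, chosen only for illustration):

```python
import itertools
from typing import NamedTuple


class Timestamp(NamedTuple):
    counter: int  # value of the TM's local logical clock
    site_id: int  # tie-breaker so two sites never issue equal timestamps


class TransactionManager:
    """Hypothetical TM that stamps each new transaction."""

    def __init__(self, site_id: int) -> None:
        self.site_id = site_id
        self._clock = itertools.count(1)

    def begin(self) -> Timestamp:
        # Tuples compare field by field, so ordering is by counter first,
        # then by site_id.
        return Timestamp(next(self._clock), self.site_id)


tm_a, tm_b = TransactionManager(site_id=1), TransactionManager(site_id=2)
t1, t2, t3 = tm_a.begin(), tm_b.begin(), tm_a.begin()
```

Here t1 and t2 share the same counter value but still compare as distinct and ordered, because the site identifier breaks the tie.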

13
3. Classic TSO Protocols - cont
  • RW and WW Conflicts
  • For rw synchronization, 2 operations conflict if:
  • Both operate on the same data item, and
  • One is a dm-read and the other is a dm-write.
  • For ww synchronization, 2 operations conflict if:
  • Both operate on the same data item, and
  • Both are dm-writes.
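
The two conflict definitions above can be sketched as a small classifier. This is only an illustration; the Op record and the conflict_type function are hypothetical names, not part of the B&G model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    kind: str  # "r" for dm-read, "w" for dm-write
    item: str  # data item the operation touches
    ts: int    # timestamp attached by the TM


def conflict_type(a: Op, b: Op) -> Optional[str]:
    """Return "rw", "ww", or None if the two operations do not conflict."""
    if a.item != b.item:
        return None
    kinds = {a.kind, b.kind}
    if kinds == {"w"}:
        return "ww"
    if kinds == {"r", "w"}:
        return "rw"
    return None  # two dm-reads never conflict


rw = conflict_type(Op("r", "x", 1), Op("w", "x", 2))
ww = conflict_type(Op("w", "x", 1), Op("w", "x", 2))
rr = conflict_type(Op("r", "x", 1), Op("r", "x", 2))
other_item = conflict_type(Op("w", "x", 1), Op("w", "y", 2))
```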

14
3. Classic TSO Protocols - cont 3.1 Basic TSO
Implementation
  • An implementation of TSO needs:
  • a TSO scheduler (S): a software module that
    receives dm-reads and dm-writes and outputs these
    operations according to the TSO rules.
  • In DDBMSs, for the 2-phase commit (2PC) to work
    properly, prewrites must also be processed
    through the TSO scheduler.
  • The basic TSO implementation distributes the
    schedulers along with the database.
  • In centralized DBMSs, 2PC can be ignored, and
    thus the basic TSO scheduler is very simple:
  • At each DM, and for each data item x stored at
    the DM, the scheduler records the largest
    timestamps of any dm-read(x) and any dm-write(x)
    that have been processed, denoted r-ts(x) and
    w-ts(x).
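
A minimal sketch of this centralized case (no 2PC, no buffering), assuming integer timestamps and the rejection tests used elsewhere in these slides: a dm-read is rejected if it is older than the last write, and a dm-write is rejected if it is older than the last read or the last write. The class name BasicTSOScheduler and its store dictionary are hypothetical.

```python
from collections import defaultdict


class BasicTSOScheduler:
    """Centralized basic T/O: keeps only the largest r-ts and w-ts per item."""

    def __init__(self):
        self.r_ts = defaultdict(int)  # largest timestamp of a processed dm-read
        self.w_ts = defaultdict(int)  # largest timestamp of a processed dm-write
        self.store = {}               # current value of each data item

    def dm_read(self, x, ts):
        """Return (accepted, value)."""
        if ts < self.w_ts[x]:         # a newer write was already applied
            return False, None
        self.r_ts[x] = max(self.r_ts[x], ts)
        return True, self.store.get(x)

    def dm_write(self, x, ts, value):
        """Return True if the write is applied, False if it is rejected."""
        if ts < self.r_ts[x] or ts < self.w_ts[x]:
            return False
        self.w_ts[x] = ts
        self.store[x] = value
        return True


s = BasicTSOScheduler()
results = [
    s.dm_write("x", 5, "a"),  # accepted
    s.dm_read("x", 7),        # accepted, sees "a"
    s.dm_write("x", 6, "b"),  # rejected: a read with timestamp 7 was processed
    s.dm_read("x", 3),        # rejected: x was written at timestamp 5
]
```

A rejected operation means its transaction is aborted and restarted with a new timestamp; the sketch leaves that step to the caller.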

15
3. Classic TSO Protocols - cont 3.1 Basic TSO
Implementation cont
  • In Distributed DBMSs, 2PC is incorporated by:
  • timestamping prewrites, and accepting or
    rejecting prewrites instead of dm-writes.
  • Once the scheduler (S) accepts a prewrite, it
    must guarantee to accept its corresponding
    dm-write.
  • RW (or WW) synchronization:
  • Once S accepts a prewrite(x) with stamp TS, and
    until its corresponding dm-write(x) is output, S
    must not output any dm-read(x) or dm-write(x)
    with a timestamp newer than TS.
  • The effect is similar to setting a write-lock on
    data item x for the duration of two-phase commit.
  • To implement the above rules, S buffers dm-reads,
    dm-writes, and prewrites.

16
3. Classic TSO Protocols - cont 3.1 Basic TSO
Implementation: The Algorithm
  • Let min-r-ts(x) be the minimum TS of buffered
    dm-reads, and let min-w-ts(x) and min-p-ts(x) be
    defined analogously.
  • RW synchronization:
  • 1. Let R be a dm-read(x).
  • If ts(R) < w-ts(x), R is rejected.
  • Else if ts(R) > min-p-ts(x), R is buffered.
  • Else R is output.
  • 2. Let P be a prewrite(x).
  • If ts(P) < r-ts(x), P is rejected.
  • Else P is buffered.
  • 3. Let W be a dm-write(x). W is never rejected.
  • If ts(W) > min-r-ts(x), W is buffered.
  • Else W is output, and the corresponding prewrite
    is de-buffered.
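
The rw rules above can be sketched for a single data item as follows. This is an illustrative sketch, not the author's simulator: the class name RWScheduler is hypothetical, timestamps are assumed to be positive integers, and the retest loop in _drain (re-checking buffered operations after something is output) is an assumption about how the buffers get emptied. Only rw synchronization is modeled; ww synchronization is a separate test.

```python
class RWScheduler:
    """Basic T/O rw synchronization with 2PC, for one data item x."""

    def __init__(self):
        self.r_ts = 0        # largest timestamp of any output dm-read(x)
        self.w_ts = 0        # largest timestamp of any output dm-write(x)
        self.reads = []      # timestamps of buffered dm-reads
        self.prewrites = []  # timestamps of buffered prewrites
        self.writes = []     # timestamps of buffered dm-writes
        self.output = []     # (operation, timestamp) pairs in output order

    @staticmethod
    def _min(buffer):
        # min-*-ts(x); infinity when nothing is buffered
        return min(buffer, default=float("inf"))

    def dm_read(self, ts):
        if ts < self.w_ts:
            return "rejected"
        self.reads.append(ts)
        self._drain()
        return "buffered" if ts in self.reads else "output"

    def prewrite(self, ts):
        if ts < self.r_ts:
            return "rejected"
        self.prewrites.append(ts)
        return "buffered"

    def dm_write(self, ts):
        # Never rejected: its prewrite must have been accepted earlier.
        self.writes.append(ts)
        self._drain()
        return "buffered" if ts in self.writes else "output"

    def _drain(self):
        # Retest buffered operations until nothing more can be output.
        progress = True
        while progress:
            progress = False
            for ts in sorted(self.reads):
                if ts <= self._min(self.prewrites):
                    self.reads.remove(ts)
                    self.r_ts = max(self.r_ts, ts)
                    self.output.append(("R", ts))
                    progress = True
            for ts in sorted(self.writes):
                if ts <= self._min(self.reads):
                    self.writes.remove(ts)
                    self.prewrites.remove(ts)  # de-buffer its prewrite
                    self.w_ts = max(self.w_ts, ts)
                    self.output.append(("W", ts))
                    progress = True


s = RWScheduler()
trace = [
    s.prewrite(4),  # accepted and buffered
    s.dm_read(5),   # buffered: 5 > min-p-ts(x) = 4
    s.dm_write(4),  # output; the read at 5 is then released
    s.dm_read(3),   # rejected: 3 < w-ts(x) = 4
    s.prewrite(2),  # rejected: 2 < r-ts(x) = 5
]
```

After this run, s.output holds the write at timestamp 4 followed by the read at timestamp 5, which is the timestamp order the protocol is meant to enforce.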

17
3. Classic TSO Protocols - 3.1 Basic TSO
Implementation cont
[Figure: Buffer emptying for basic T/O rw synchronization]
18
3. Classic TSO Protocols - 3.1 Basic TSO
Implementation cont
  • - Author's note -
  • Typo in the previous pseudo-code figure?
  • The pseudocode given for when an R-operation is
    ready is as follows:
  • R is ready if it precedes the earliest prewrite
    request: if ts(R) < min-P-ts(x)
  • I found that the following modification is
    needed:
  • R is ready if it is <= the earliest prewrite
    request:
  • if ts(R) <= min-P-ts(x)

19
3. Classic TSO Protocols - 3.1 Basic TSO
Implementation cont
  • -- Author's note (cont) --

Test case (input):
Time  Op  DI  DIVal
4     P   0   4
6     P   0   6
5     R   0
5     P   0   5
5     W   0   5
4     W   0   4
6     W   0   6
Output:
Time  Op  DI  DIVal  Status     Conflict
4     W   0   4      SUCCEEDED  No_Conflict
As I expected, the output shows only transaction
4 being executed, even though logically I would
expect transactions 5 and 6 to succeed as well.
20
3. Classic TSO Protocols - 3.1 Basic TSO
Implementation cont
  • -- Author's note (cont) --
  • Therefore, I made the following correction to the
    source code:
  • R is ready if it does not follow the earliest
    prewrite request:
  • if ts(R) <= min-P-ts(x)
  • New output:

Time  Op  DIVal  Status   Conflict
4     W   4      SUCCESS  No_Conflict
5     R   4      SUCCESS  RW_Conflict
5     W   5      SUCCESS  RW_Conflict
6     W   6      SUCCESS  No_Conflict
21
3. Classic TSO Protocols - 3.1 Basic TSO
Implementation cont
  • WW synchronization:
  • 1. Let P be a prewrite(x).
  • If ts(P) < w-ts(x), P is rejected.
  • Else P is buffered.
  • 2. Let W be a dm-write(x). W is never rejected.
  • If ts(W) > min-p-ts(x), W is buffered.
  • Else W is output.
  • 3. When W is output, the corresponding
    prewrite is de-buffered. If min-p-ts(x)
    increased, buffered dm-writes are retested to see
    if any can now be output.

22
3.2 The Thomas Write Rule (TWR)
  • Basic TSO can be optimized for ww synchronization
    using TWR:
  • Let W be a dm-write(x), and suppose ts(W) <
    W-ts(x).
  • Instead of rejecting W, it can simply be ignored.
  • TWR applies to dm-writes that try to place
    obsolete data into the DB.
  • TWR guarantees that applying a set of dm-writes
    to x has the same effect as if the dm-writes
    were applied in TS order.
  • If TWR is used, there is no need to incorporate
    2PC into the ww-synchronization algorithm: the
    ww-scheduler always accepts prewrites and never
    buffers dm-writes.
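
A small sketch of the Thomas Write Rule for a single data item, showing that ignoring obsolete writes leaves the same final value as applying every write in timestamp order. The function name apply_with_twr and the sample data are hypothetical.

```python
def apply_with_twr(writes):
    """Apply (timestamp, value) dm-writes in arrival order under TWR.

    Returns the final value of the item and the timestamps that were ignored.
    """
    w_ts = float("-inf")  # W-ts(x): largest timestamp written so far
    value = None
    ignored = []
    for ts, new_value in writes:
        if ts < w_ts:
            ignored.append(ts)  # obsolete write: skip it, do not reject
            continue
        w_ts, value = ts, new_value
    return value, ignored


arrival_order = [(2, "b"), (1, "a"), (3, "c")]
final_value, ignored = apply_with_twr(arrival_order)

# Reference result: the same writes applied strictly in timestamp order.
in_order_value, in_order_ignored = apply_with_twr(sorted(arrival_order))
```

In this run the write with timestamp 1 arrives late and is dropped, yet the final value matches the in-order run, which is the guarantee stated above.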

23
3.3 Multiversion TSO
  • Basic TSO can be improved for rw-synch. using
    multiversion data items.
  • For each data item x there is a set of r-ts's and
    a set of (w-ts, value) pairs, called Versions.
  • The r-ts's of x record the TSs of all executed
    dm-read(x) operations, and the Versions record
    the TSs and values of all executed dm-write(x)
    operations.
  • In practice one cannot store r-ts's and versions
    forever; techniques for deleting old versions and
    timestamps are needed.
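
The slides describe only the data kept per item, not how it is used. The sketch below assumes the usual multiversion T/O rules, which are not stated in the slides: a dm-read with timestamp TS returns the version with the largest w-ts below TS and is never rejected, and a dm-write is rejected if some recorded read should have seen it but read an older version instead. The class name MultiversionItem is hypothetical, and timestamps are assumed to be positive integers.

```python
import bisect


class MultiversionItem:
    """One data item x with its read timestamps and (w-ts, value) versions."""

    def __init__(self, initial_value):
        self.w_ts = [0]                # version timestamps, kept sorted
        self.values = [initial_value]  # values[i] belongs to w_ts[i]
        self.r_ts = []                 # timestamps of executed reads, sorted

    def dm_read(self, ts):
        # Version with the largest w-ts strictly below ts.
        i = bisect.bisect_left(self.w_ts, ts) - 1
        bisect.insort(self.r_ts, ts)
        return self.values[i]

    def dm_write(self, ts, value):
        # The new version would sit between ts and the next newer version.
        i = bisect.bisect_right(self.w_ts, ts)
        upper = self.w_ts[i] if i < len(self.w_ts) else float("inf")
        # Reject if a read in that interval already saw an older version.
        j = bisect.bisect_right(self.r_ts, ts)
        if j < len(self.r_ts) and self.r_ts[j] <= upper:
            return False
        self.w_ts.insert(i, ts)
        self.values.insert(i, value)
        return True


item = MultiversionItem("v0")
ok_10 = item.dm_write(10, "v10")   # accepted
read_5 = item.dm_read(5)           # old version is still available
read_15 = item.dm_read(15)         # newest version
late_12 = item.dm_write(12, "v12") # rejected: the read at 15 missed it
late_7 = item.dm_write(7, "v7")    # accepted: no read falls between 7 and 10
read_8 = item.dm_read(8)
```

Under these assumed rules only writes can be rejected, which is where the improvement for rw synchronization comes from.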

24
3.4 Conservative TSO
  • Eliminates restarts during TSO scheduling.
  • Requires that each scheduler receive dm-reads (or
    dm-writes) from each TM in TS-order.
  • Since the network is assumed to be FIFO, this
    ordering is accomplished by requiring TMs to send
    dm-reads (or dm-writes) to schedulers in TS order.
  • It buffers dm-reads and dm-writes as part of its
    normal operation. When a scheduler buffers an
    operation, it remembers the TM that sent it.

25
3.5 Timestamp Management
  • Common critique of TSO schedulers:
  • Too much memory is needed to store timestamps.
  • This can be overcome by "forgetting" old
    timestamps.
  • Timestamps (TSs) are used in Basic-TSO to reject
    operations that "arrive late"; for example, a
    dm-read(x) with stamp TS1 is rejected if it
    arrives after a dm-write(x) with stamp TS2, where
    TS1 < TS2.
  • In principle, TS1 and TS2 can differ by an
    arbitrary amount.
  • In practice, it is unlikely that TSs will differ
    by more than a few minutes.
  • Consequently, TSs can be stored in small tables
    that are periodically purged.

26
3.5 Timestamp Management cont
  • R-ts's are stored in R-table entries of the form
    (x, R-ts(x)); for any data item x, there is at
    most one entry.
  • A variable R-min holds the maximum value of any
    TS that has been purged from the table.
  • To update R-ts(x), S modifies the (x, R-ts(x))
    entry in the table, if one exists. Otherwise, a
    new entry is created.
  • When the R-table is full, S selects an
    appropriate value for R-min and deletes all
    entries from the table with a smaller timestamp.
  • The W-ts's are managed similarly. Analogous
    techniques can be devised for Multiversion TSO
    databases.
  • Maintaining TSs for Conservative TSO is even
    cheaper, since it requires only timestamped
    operations, not also timestamped data.
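
A minimal sketch of such an R-table. Two points are assumptions rather than something the slides specify: a lookup for a purged item returns R-min (a safe overestimate of its true R-ts), and the "appropriate value" for R-min is taken to be the median timestamp in the table. The sketch also purges entries at or below the cutoff, rather than strictly below, so that a purge always frees at least one slot. The class name RTable is hypothetical.

```python
class RTable:
    """Bounded table of (x, R-ts(x)) entries with a purge watermark R-min."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # x -> R-ts(x)
        self.r_min = 0     # at least as large as every purged timestamp

    def lookup(self, x):
        # For a purged item the true R-ts is unknown but <= R-min,
        # so R-min is a conservative answer.
        return self.entries.get(x, self.r_min)

    def update(self, x, ts):
        if x not in self.entries and len(self.entries) >= self.capacity:
            self._purge()
        self.entries[x] = max(ts, self.lookup(x))

    def _purge(self):
        stamps = sorted(self.entries.values())
        cutoff = stamps[len(stamps) // 2]  # assumed policy: median timestamp
        self.r_min = max(self.r_min, cutoff)
        self.entries = {x: t for x, t in self.entries.items() if t > cutoff}


table = RTable(capacity=3)
for name, ts in [("a", 10), ("b", 20), ("c", 30)]:
    table.update(name, ts)
table.update("d", 40)  # table is full, so a purge runs first

purged_lookup = table.lookup("a")  # answered from R-min
kept_lookup = table.lookup("c")    # still stored exactly
```

The cost of purging is that some operations on purged items may be rejected unnecessarily, since R-min can overstate their real read timestamp.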

27
3.6 Integrated CC Methods
  • An integrated CC method consists of:
  • two components (an rw and a ww synchronization
    technique), and
  • an interface between the components to ensure
    serializability.
  • Bernstein & Goodman list 48 CC methods that can
    be constructed by combining synchronization
    techniques based on 2PL and/or TSO, giving:
  • Pure 2PL methods,
  • Pure TSO methods, or
  • Methods that combine 2PL and TSO techniques.
  • This presentation discusses only the pure TSO
    techniques.

28
3.6 Integrated CC Methods cont: Pure TSO
Method  RW technique      WW technique
1       Basic T/O         Basic T/O
2       Basic T/O         Thomas Write Rule (TWR)
3       Basic T/O         Multiversion T/O
4       Basic T/O         Conservative T/O
5       Multiversion T/O  Basic T/O
6       Multiversion T/O  TWR
7       Multiversion T/O  Multiversion T/O
8       Multiversion T/O  Conservative T/O
9       Conservative T/O  Basic T/O
10      Conservative T/O  TWR
11      Conservative T/O  Multiversion T/O
12      Conservative T/O  Conservative T/O
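
The 12 pure TSO methods in the table are simply every pairing of 3 rw techniques with 4 ww techniques. A short sketch that regenerates the list in the same order (the variable names are hypothetical):

```python
from itertools import product

RW_TECHNIQUES = ["Basic T/O", "Multiversion T/O", "Conservative T/O"]
WW_TECHNIQUES = [
    "Basic T/O",
    "Thomas Write Rule (TWR)",
    "Multiversion T/O",
    "Conservative T/O",
]

# product() varies the ww technique fastest, matching the table's row order.
pure_tso_methods = [
    (number, rw, ww)
    for number, (rw, ww) in enumerate(product(RW_TECHNIQUES, WW_TECHNIQUES), start=1)
]

for number, rw, ww in pure_tso_methods:
    print(f"{number:<3} {rw:<17} {ww}")
```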
29
4. TSO PERFORMANCE
  • Main performance metrics for CC algorithms:
  • system throughput
  • transaction response time
  • 4 cost factors influence these metrics:
  • inter-site communication
  • local processing
  • transaction restarts
  • transaction blocking
  • The impact of each cost factor varies depending
    on algorithm, system, and application type.
  • B&G stated, at the time they wrote their paper in
    1981:
  • "impact detail is not understood yet, and a
    comprehensive quantitative analysis is beyond
    the state of the art".

30
4. TSO PERFORMANCE cont
  • Since 1981, more research has been done on
    distributed CC performance.
  • Carey & Livny (1988) studied the performance of 4
    algorithms: Distributed 2PL, Wound-Wait (WW),
    Basic TSO (BTO), and a Distributed Optimistic
    algorithm (OPT). They:
  • Examined various degrees of contention,
    distributedness of the workload, and data
    replication.
  • Found that 2PL and OPT dominated BTO and WW.
  • Concluded that Optimistic Locking (where
    transactions lock remote copies of data only as
    they enter into the commit protocol, at the risk
    of end-of-transaction deadlocks) is the best
    performer in replicated DBs where messages are
    costly.

31
5. TSO TODAY
  • Until recently, TSO was viewed as the black sheep
    of the family of CC algorithms. This seems to be
    changing.
  • Nørvåg et al. (1997) present a simulation study
    that examines the performance and response times
    of two CC algorithms, TSO and 2PL.
  • Their results show the following:
  • For a mix of short and long transactions (Ts),
    the throughput is significantly higher for TSO
    than for the 2PL scheduler.
  • For short Ts, the performance is almost
    identical.
  • For long Ts, 2PL performs better.
  • The authors comment:
  • TSO throughput was not expected to be higher than
    2PL! (Most previous studies used centralized DBs.)
  • In tests done using the centralized version of
    the simulator, 2PL did indeed perform better than
    TSO in most cases.

32
5. TSO TODAY cont
  • Srinivasa et al. (2001) take a new look at TSO
    CC.
  • They state that during the 1980s and 1990s, the
    popular conception was that TSO techniques like
    Basic-TSO (BTO) perform poorly compared to
    dynamic 2PL (D-2PL).
  • That is because previous studies concentrated on
    centralized DBs and low data contention
    scenarios.
  • The poor showing is mainly due to BTO's high
    reject rate (and thus high restart rate), which
    causes it to reach hardware resource limits.
  • But today's processors are faster, and workload
    characteristics have changed.
  • The authors show that BTO's performance is much
    better than D-2PL's for a wide range of
    conditions, especially for high data contention.
  • D-2PL outperforms BTO only when both data
    contention and message latency are low.
    Increasing throughput demands make high data
    contention important, and BTO becomes an
    attractive choice for concurrency control.

33
5. TSO TODAY cont
5.1 New TSO Uses
  • Pacitti et al. (1999, 2001) proposed refreshment
    algorithms (using TSO).
  • These address the central problem of maintaining
    replica consistency in a lazy master replicated
    system.
  • In their related work section they also mention
    other relatively recent work where the authors
    propose 2 new lazy update protocols, also based
    on timestamp ordering.

34
5. TSO TODAY 5.1 New TSO Uses cont
  • Jensen & Lomet (2001) provide a new approach to
    transaction timestamping, covering both the time
    at which the timestamp (TS) is chosen and the CC
    protocol.
  • Their CC method combines TSO with 2PL.
  • They delay the choice of a transaction's TS
    until the moment when the TS is needed by a
    statement in the transaction, or until the
    transaction commits.
  • They chose to delay choosing the TS because the
    classical approach of forcing the choice of the
    TS at transaction start increases the chances
    that TS consistency checking will fail, thus
    resulting in aborts.

35
6. CONCLUSION
  • This presentation discussed Timestamp Ordering
    (TSO) Concurrency Control (CC), from its
    beginnings (1970s) to the present time (2000s).
  • The classic TSO methods were described
    thoroughly by Bernstein & Goodman in a 1981
    survey of the state of the art in DDBMS CC.
  • B&G design a system model and define terminology
    and concepts for a variety of CC algorithms. They
    define the concept of decomposing CC algorithms
    into read-write and write-write synchronization
    sub-algorithms. They do not focus on performance.
  • The main CC performance metrics are:
  • System Throughput
  • Transaction Response Time

36
6. CONCLUSION cont
  • As was foreseen in the Bernstein & Goodman 1981
    survey, many studies of the performance of CC
    algorithms have been performed since then (e.g.,
    the 1988, 1997, and 2001 studies mentioned
    earlier in this presentation).
  • Even though in the 1970s and 1980s TSO was viewed
    as the black sheep of the family of CC algorithms
    from the point of view of performance, this view
    is changing today, due to recent conclusions that
    TSO actually performs better in many situations
    in today's DDBMSs.

37
7. REFERENCES 
  • [1] T. Komiya, H. Ohshida, M. Takizawa. Mobile
    Agent Model for Transaction Processing in
    Distributed Database Systems. Information
    Sciences, v. 154, issue 1-2, Aug 2003.
  • [2] R. Ramakrishnan, J. Gehrke. Database
    Management Systems. McGraw Hill, 2003. Chapter
    22 (Parallel and Distributed Databases).
  • [3] A. Tanenbaum, M. van Steen. Distributed
    Systems: Principles and Paradigms.
    Prentice-Hall, 2002. Chapters 1, 2 and 5.
  • [4] Philip A. Bernstein, Nathan Goodman.
    Concurrency Control in Distributed Database
    Systems. ACM Computing Surveys (CSUR), Volume
    13, Issue 2, June 1981.

38
7. REFERENCES cont
  • [5] Michael J. Carey, Miron Livny. Distributed
    Concurrency Control Performance: A Study of
    Algorithms, Distribution, and Replication.
    Fourteenth International Conference on Very Large
    DataBases (VLDB), August 29 - September 1, 1988,
    Los Angeles, California, USA, Proceedings.
  • [6] Kjetil Nørvåg, Olav Sandstå, and Kjell
    Bratbergsengen. Concurrency Control in
    Distributed Object-Oriented Database Systems.
    Advances in Databases and Information Systems,
    1997.

39
7. REFERENCES cont
  • [7] Rashmi Srinivasa, Craig Williams, Paul F.
    Reynolds Jr. A New Look at Timestamp Ordering
    Concurrency Control. Database and Expert Systems
    Applications, 12th International Conference, DEXA
    2001, Munich, Germany, September 3-5, 2001,
    Proceedings.
  • [8] C. Jensen and D. Lomet. Transaction
    timestamping in (temporal) databases. VLDB
    Conference, Rome, Italy, Sept. 2001. Available at
    ftp://ftp.research.microsoft.com/users/lomet/pub/temporaltime.pdf

40
7. REFERENCES cont
  • [9] E. Pacitti, P. Minet, and E. Simon. Fast
    algorithms for maintaining replica consistency in
    lazy master replicated databases. VLDB'99,
    126-137, Proceedings of 25th International
    Conference on Very Large Data Bases, September
    7-10, 1999, Edinburgh, Scotland, UK. Also
    published in Distributed and Parallel Databases,
    9, 237-267, 2001, by Kluwer Academic Publishers.
  • [10] Stéphane Gançarski, Hubert Naacke, Esther
    Pacitti, Patrick Valduriez. Parallel Processing
    with Autonomous Databases in a Cluster System.
    Proc. CoopIS 2002, Irvine, California.

41
7. REFERENCES cont
  • [11] M. Tamer Özsu, Patrick Valduriez.
    Principles of Distributed Database Systems,
    Second Edition. Prentice Hall, ISBN
    0-13-659707-6, 1999. Notes available at
    http://www.cs.ualberta.ca/database/ddbook.html
  • [12] J. N. Gray. How High is High Performance
    Transaction Processing? Presentation at the High
    Performance Transaction Processing Workshop
    (HPTS99), Asilomar, California, 26-29 Sep 1999.
  • [13] Y. Breitbart, R. Komondoor, R. Rastogi, and
    S. Seshadri. Update propagation protocols for
    replicated databases. ACM SIGMOD Int. Conference
    on Management of Data, Philadelphia, PA, May
    1999.