Title: Distributed Concurrency Control
1Distributed Concurrency Control
2Motivation
- World-wide telephone system
- World-wide computer network
- World-wide database system
- Collaborative projects: the project has a database composed of the smaller local databases of each researcher
- A travel company organizing a vacation: it consults local subcontractors (local companies), which list prices and quality ratings for hotels, restaurants, and fares
- A library service: people looking for articles query two or more libraries
3Types of distributed systems
- Homogeneous federation: servers participating in the federation are logically part of a single system; they all run the same suite of protocols, and they may even be under the control of a master site
- A homogeneous federation is characterized by distribution transparency
- Heterogeneous federation: servers participating in the federation are autonomous and heterogeneous; they may run different protocols, and there is no master site
4Types of transactions and schedules
- Local transactions
- Global transactions
5Concurrency Control in Homogeneous Federations
6Preliminaries
- Let the federation consist of n sites, and let T = {T1, ..., Tm} be a set of global transactions
- Let s1, ..., sn be local schedules
- Let D = ∪ Di, where Di is the local database at site i
- We assume no replication (each replica is treated as a separate data item)
- A global schedule for T and s1, ..., sn is a schedule s for T such that its local projection equals the local schedule at each site, i.e. Πi(s) = si for all i, 1 ≤ i ≤ n
7Preliminaries
- Πi(s) denotes the projection of the schedule s onto site i
- We call the projection of a transaction T onto site i a subtransaction of T (Ti), which comprises all steps of T at site i
- Global transactions formally have to have Commit operations at all sites at which they are active
- Conflict serializability: a global [local] schedule s is globally [locally] conflict serializable if there exists a serial schedule over the global [local] (sub-)transactions that is conflict equivalent to s
8Example 1
- Consider a federation of two sites, where D1 = {x} and D2 = {y}. Then s1 = r1(x) w2(x) and s2 = w1(y) r2(y) are local schedules, and
- s = r1(x) w1(y) w2(x) c1 r2(y) c2
- is a global schedule
- Π1(s) = s1 and Π2(s) = s2
- Another form of the schedule:
- server 1: r1(x) w2(x)
- server 2: w1(y) r2(y)
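The projection in Example 1 can be reproduced mechanically. The following is a minimal sketch, assuming schedules are written as space-separated tokens such as `r1(x)`; the names `SITE_OF_ITEM`, `parse`, and `project` are hypothetical and not part of the slides. As in the example, commit operations are left out of the projection.

```python
import re

# Hypothetical placement of data items on sites, taken from Example 1.
SITE_OF_ITEM = {"x": 1, "y": 2}

def parse(schedule):
    """Split 'r1(x) w1(y) c1' into (action, transaction, item) triples."""
    ops = []
    for token in schedule.split():
        m = re.fullmatch(r"([rwca])(\d+)(?:\((\w+)\))?", token)
        ops.append((m.group(1), int(m.group(2)), m.group(3)))
    return ops

def project(schedule, site):
    """Return the data operations of `schedule` on items stored at `site`."""
    return [
        f"{act}{txn}({item})"
        for act, txn, item in parse(schedule)
        if item is not None and SITE_OF_ITEM[item] == site
    ]

s = "r1(x) w1(y) w2(x) c1 r2(y) c2"
print(project(s, 1))  # ['r1(x)', 'w2(x)']
print(project(s, 2))  # ['w1(y)', 'r2(y)']
```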
9Example 2
- Consider a federation of two sites, where D1 = {x} and D2 = {y}. Assume the following schedule:
- server 1: r1(x) w2(x)
- server 2: r2(y) w1(y)
- The schedule is not conflict serializable, since the conflict serialization graph has a cycle (T1 → T2 at server 1 and T2 → T1 at server 2)
10Global conflict serializability
- Let s be a global schedule with local schedules s1, s2, ..., sn involving a set T of transactions such that each si, 1 ≤ i ≤ n, is conflict serializable. Then the following holds:
- s is globally conflict serializable iff there exists a total order < on T that is consistent with the local serialization orders of the transactions (proof)
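One way to read the theorem operationally: collect the conflict edges of every local schedule and look for a total order that respects all of them. The sketch below assumes schedules contain only read and write tokens such as `r1(x)`; the helper names `conflict_edges` and `global_order` are hypothetical. It uses `graphlib` from the Python standard library (3.9 or later).

```python
import re
from graphlib import TopologicalSorter, CycleError

def conflict_edges(schedule):
    """Edges (Ti, Tj) for each pair of conflicting operations in one local schedule."""
    ops = [re.fullmatch(r"([rw])(\d+)\((\w+)\)", t).groups() for t in schedule.split()]
    edges = set()
    for k, (a1, t1, x1) in enumerate(ops):
        for a2, t2, x2 in ops[k + 1:]:
            if x1 == x2 and t1 != t2 and "w" in (a1, a2):
                edges.add((int(t1), int(t2)))
    return edges

def global_order(local_schedules):
    """Return a total order consistent with all local orders, or None if there is none."""
    preds = {}  # transaction -> set of transactions that must come before it
    for s in local_schedules:
        for before, after in conflict_edges(s):
            preds.setdefault(after, set()).add(before)
            preds.setdefault(before, set())
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None

print(global_order(["r1(x) w2(x)", "w1(y) r2(y)"]))  # Example 1
print(global_order(["r1(x) w2(x)", "r2(y) w1(y)"]))  # Example 2
```

For the local schedules of Example 1 an order exists (T1 before T2); for Example 2 the function returns `None`, matching the cycle in the conflict serialization graph.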
11Concurrency Control Algorithms
- Distributed 2PL locking algorithms
- Distributed T/O algorithms
- Distributed optimistic algorithms
12Distributed 2PL locking algorithms
- The main problem is how to determine that a transaction has reached its lock point
- Primary site 2PL: lock management is done exclusively at a distinguished site, the primary site
- Distributed 2PL: when a server wants to start the unlocking phase for a transaction, it communicates with all other servers regarding the lock point of that transaction
- Strong 2PL: all locks acquired on behalf of a transaction are held until the transaction wants to commit (2PC)
13Distributed T/O algorithms
- Assume that each local site (scheduler) executes its private T/O protocol for synchronizing accesses in its portion of the database
- server 1: r1(x) w2(x)
- server 2: r2(y) w1(y)
- If timestamps were assigned as in the centralized case, each of the two servers would assign the value 1 to the first transaction that it sees locally (T1 on server 1 and T2 on server 2), which would lead to a globally incorrect result
14Distributed T/O algorithms
- We have to find a way to assign globally unique timestamps to transactions at all sites
- Centralized approach: a particular server is responsible for generating and distributing timestamps
- Distributed approach: each server generates a unique local timestamp using a clock or counter
- server 1: r1(x) w2(x)
- server 2: r2(y) w1(y)
- TS(T1) = (1, 1)
- TS(T2) = (1, 2)
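The two timestamps on this slide can be read as pairs of a local counter value and a server number, which makes them globally unique. The sketch below works under that assumption; the class `TimestampGenerator` and its method are hypothetical names. Python compares tuples lexicographically, so the counter decides first and the server number breaks ties.

```python
import itertools

class TimestampGenerator:
    """Per-server generator of (local counter, server id) timestamps."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)  # local counter starts at 1

    def next_timestamp(self):
        return (next(self._counter), self.site_id)

server1 = TimestampGenerator(site_id=1)
server2 = TimestampGenerator(site_id=2)

ts_t1 = server1.next_timestamp()  # T1 starts at server 1
ts_t2 = server2.next_timestamp()  # T2 starts at server 2
print(ts_t1, ts_t2, ts_t1 < ts_t2)
```

Both servers hand out counter value 1, yet the pairs differ and are totally ordered, so every site sees T1 as older than T2.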
15Distributed T/O algorithms
- Lamport clock: used to solve the more general problem of fixing the notion of logical time in an asynchronous network
- Sites communicate through messages
- Logical time is a pair (c, i), where c is a nonnegative integer and i is a transaction number
- The clock variable is increased by 1 at every transaction operation; the logical time of the operation is defined as the value of the clock immediately after the operation
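A minimal sketch of such a clock follows. The increment-by-one rule comes from the slide; the receive rule (advance the local clock to at least the sender's value before counting the receive event) is the standard Lamport rule and is added here as an assumption, since the slide does not spell it out. The class name `LamportClock` and the `owner` field are hypothetical.

```python
class LamportClock:
    """Logical clock producing times of the form (c, owner)."""

    def __init__(self, owner):
        self.owner = owner
        self.c = 0

    def tick(self):
        """Local operation: increase the clock by 1 and return the logical time."""
        self.c += 1
        return (self.c, self.owner)

    def send(self):
        """Sending a message counts as an operation; its time travels with the message."""
        return self.tick()

    def receive(self, sender_time):
        """Assumed standard rule: never fall behind the sender's clock value."""
        self.c = max(self.c, sender_time[0])
        return self.tick()

a, b = LamportClock(owner=1), LamportClock(owner=2)
first = a.tick()           # (1, 1)
message_time = a.send()    # (2, 1)
received = b.receive(message_time)
print(first, message_time, received)
```

The receive event at `b` gets a larger logical time than the send event at `a`, even though `b` had performed no operations before.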
16Distributed optimistic algorithms
- Under the optimistic approach, every transaction is processed in three phases (read, validation, write)
- Problem: how to ensure that validation comes to the same result at every site where a global transaction has been active
- Not implemented
17Distributed Deadlock Detection
- Problem: global deadlock, which cannot be detected by local means only (each server keeps a WFG locally)
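[Figure: local wait-for graphs at Site 1, Site 2, and Site 3 over transactions T1, T2, and T3, with "wait for lock" and "wait for message" edges that together form a global deadlock]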
18Distributed Deadlock Detection
- Centralized detection: a centralized monitor collects the local WFGs
- Problems: performance, false deadlocks
- Timeout approach
- Distributed approaches
- Edge chasing
- Path pushing
19Distributed Deadlock Detection
- Edge chasing: each transaction that becomes blocked in a wait relationship sends its identifier in a special message, called a probe, to the blocking transaction. If a transaction receives a probe, it forwards it to all transactions by which it is itself blocked. If the probe comes back to the transaction by which it was initiated, this transaction knows that it is participating in a cycle and hence is part of a deadlock
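The probe forwarding can be simulated on a single wait-for relation. This is only a sketch: the dictionary `blocked_by` stands in for the messages that real servers would exchange, the function name `detects_deadlock` is hypothetical, and the `visited` set is an addition so that the same probe is not forwarded twice through a transaction.

```python
def detects_deadlock(blocked_by, initiator):
    """Forward a probe carrying `initiator` along wait-for edges.

    `blocked_by[t]` lists the transactions that block t.
    Returns True if the probe comes back to the initiator.
    """
    pending = list(blocked_by.get(initiator, ()))
    visited = set()
    while pending:
        txn = pending.pop()
        if txn == initiator:
            return True          # probe returned: initiator is on a cycle
        if txn in visited:
            continue
        visited.add(txn)
        pending.extend(blocked_by.get(txn, ()))  # forward to all blockers
    return False

cyclic = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
acyclic = {"T1": ["T2"], "T2": ["T3"]}
print(detects_deadlock(cyclic, "T1"), detects_deadlock(acyclic, "T1"))
```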
20Distributed Deadlock Detection
- Path pushing: entire paths are circulated between transactions instead of single transaction identifiers.
- The basic algorithm is as follows:
- Each server that has a wait-for path from transaction Ti to transaction Tj, such that Ti has an incoming waits-for-message edge and Tj has an outgoing waits-for-message edge, sends that path to the server along the outgoing edge, provided the identifier of Ti is smaller than that of Tj
- Upon receiving a path, the server concatenates it with the local paths that already exist and forwards the result along its outgoing edges again. If there exists a cycle among n servers, at least one of them will detect that cycle in at most n such rounds
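The rounds of path pushing can be sketched as a small simulation. Everything concrete here is an assumption for illustration: the three-site layout (one local wait-for edge per site, each blocking transaction waiting for a message from the next site), the integer transaction identifiers, and the function name `path_pushing`. As a simplification, local paths are single edges and a site sends any local path whose last transaction has an outgoing message edge.

```python
def path_pushing(sites):
    """Return (site, cycle) for the first deadlock detected, or None.

    sites[s]["edges"]: local wait-for edges (Ti, Tj) at site s.
    sites[s]["out"]:   transaction -> site it waits for a message from.
    """
    outgoing = {sid: [list(e) for e in info["edges"]] for sid, info in sites.items()}
    for _ in range(len(sites)):                      # at most n rounds
        inbox = {sid: [] for sid in sites}
        for sid, paths in outgoing.items():
            for path in paths:
                target = sites[sid]["out"].get(path[-1])
                # Send only if the first identifier is smaller than the last.
                if target is not None and path[0] < path[-1]:
                    inbox[target].append(path)
        outgoing = {sid: [] for sid in sites}
        for sid, received in inbox.items():
            for path in received:
                for first, second in sites[sid]["edges"]:
                    if first == path[-1]:            # concatenate with a local edge
                        extended = path + [second]
                        if second == path[0]:
                            return sid, extended     # cycle closed at this site
                        outgoing[sid].append(extended)
    return None

example = {
    1: {"edges": [(1, 2)], "out": {2: 2}},
    2: {"edges": [(2, 3)], "out": {3: 3}},
    3: {"edges": [(3, 1)], "out": {1: 1}},
}
print(path_pushing(example))
```

In this assumed layout the path T1 → T2 travels from site 1 to site 2, is extended to T1 → T2 → T3, and is closed into a cycle at site 3 by the local edge T3 → T1.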
21Distributed Deadlock Detection
- Consider the deadlock example
[Figure: path pushing on the deadlock example; wait-for paths over T1, T2, and T3 are passed among Site 1, Site 2, and Site 3]
Site 3 knows that T3 → T1 locally and detects the global deadlock
22Concurrency Control in Heterogeneous Federations
23Preliminaries
- A heterogeneous distributed database system (HDDBS) integrates pre-existing external data sources to support global applications accessing more than one external data source
- HDDBS vs LDBS
- Local autonomy and heterogeneity of local data sources
- Design autonomy
- Communication autonomy
- Execution autonomy
- Local autonomy reflects the fact that local data sources were designed and implemented independently and were totally unaware of the integration process
24Preliminaries
- Design autonomy: the capability of a database system to choose its own data model and implementation procedures
- Communication autonomy: the capability of a database system to decide what other systems it will communicate with and what information it will exchange with them
- Execution autonomy: the capability of a database system to decide how and when to execute requests received from other systems
25Difficulties
- Actions of a transaction may be executed in different EDSs, one of which may use locks to guarantee serializability, while another may use timestamps
- Guaranteeing the properties of transactions may restrict local autonomy; e.g., to guarantee atomicity, the participating EDSs must execute some type of commit protocol
- EDSs may not provide the functionality necessary to implement the required global coordination protocols. For a commit protocol, an EDS must be able to become prepared, guaranteeing that the local actions of a transaction can be completed. Existing EDSs may not allow a transaction to enter this state
26HDDBS model
[Figure: HDDBS model. Global transactions are submitted to the Global Transaction Manager (GTM); each External Data Source (EDS1, EDS2) has a Local Transaction Manager (LTM), which also receives local transactions]
27Basic notation
- An HDDBS consists of a set D of external data sources and a set 𝒯 of transactions
- D = {D1, D2, ..., Dn}, where Di is the i-th external data source
- 𝒯 = T ∪ T1 ∪ T2 ∪ ... ∪ Tn
- T: a set of global transactions
- Ti: a set of local transactions that access Di only
28Example
- Given a federation of two servers
- D1 = {a, b}, D2 = {c, d, e}, D = {a, b, c, d, e}
- Local transactions
- T1 = r(a) w(b), T2 = w(d) r(e)
- Global transactions
- T3 = w(a) r(d), T4 = w(b) r(c) w(e)
- Local schedules
- s1 = r1(a) w3(a) c3 w1(b) c1 w4(b) c4
- s2 = r4(c) w2(d) r3(d) c3 r2(e) c2 w4(e) c4
29Global schedule
- Let the heterogeneous federation consist of n sites, let T1, ..., Tn be the sets of local transactions at sites 1, ..., n, and let T be a set of global transactions. Finally, let s1, s2, ..., sn be local schedules.
- A (heterogeneous) global schedule (for s1, ..., sn) is a schedule s for T ∪ T1 ∪ ... ∪ Tn such that its local projection equals the local schedule at each site, i.e. Πi(s) = si for all i, 1 ≤ i ≤ n
30Correctness of schedules
- Given a federation of two servers
- D1 = {a}, D2 = {b, c}
- Given two global transactions T1 and T2 and a local transaction T3
- T1 = r(a) w(b), T2 = w(a) r(c), T3 = r(b) w(c)
- Assume the following local schedules
- server 1: r1(a) w2(a)
- server 2: r3(b) w1(b) r2(c) w3(c)
- Transactions T1 and T2 are executed strictly serially at both sites, yet the global schedule is not globally serializable: T1 → T2 at server 1, while at server 2 the local transaction T3 creates an indirect conflict T2 → T3 → T1
31Global serializability
- In a heterogeneous federation the GTM has no direct control over local schedules; the best it can do is to control the serialization order of global transactions by carefully controlling the order in which operations are sent to local systems for execution and in which these get acknowledged.
- Indirect conflict: Ti and Tk are in indirect conflict in si if there exists a sequence T1, ..., Tr of transactions in si such that Ti is in si in a direct conflict with T1, Tj is in si in a direct conflict with Tj+1, 1 ≤ j ≤ r-1, and Tr is in si in a direct conflict with Tk
- Conflict equivalence: two schedules contain the same operations and the same direct and indirect conflicts
32Global serializability
- Global Conflict Serialization Graph
- Let s be a global schedule for the local schedules s1, s2, ..., sn; let G(si) denote the conflict serialization graph of si, 1 ≤ i ≤ n, derived from direct and indirect conflicts. The global conflict serialization graph of s is defined as the union of all G(si), 1 ≤ i ≤ n, i.e. G(s) = G(s1) ∪ G(s2) ∪ ... ∪ G(sn)
- Global serializability theorem
- Let the local schedules s1, s2, ..., sn be given, where each G(si), 1 ≤ i ≤ n, is acyclic. Let s be a global schedule for the si, 1 ≤ i ≤ n. The global schedule s is globally conflict serializable iff G(s) is acyclic
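The construction can be tried on the two-server example with T1, T2 and the local transaction T3 (server 1: r1(a) w2(a); server 2: r3(b) w1(b) r2(c) w3(c)). The sketch below assumes schedules made of read and write tokens only; `direct_conflicts`, `closure`, and `has_cycle` are hypothetical helper names. Indirect conflicts are obtained as the transitive closure of the direct ones.

```python
import re
from itertools import product

def direct_conflicts(schedule):
    """Edges (Ti, Tj) for conflicting operation pairs, in schedule order."""
    ops = [re.fullmatch(r"([rw])(\d+)\((\w+)\)", t).groups() for t in schedule.split()]
    return {
        (int(t1), int(t2))
        for k, (a1, t1, x1) in enumerate(ops)
        for a2, t2, x2 in ops[k + 1:]
        if x1 == x2 and t1 != t2 and "w" in (a1, a2)
    }

def closure(edges):
    """Add an edge Ti -> Tk whenever a chain of conflicts links Ti to Tk."""
    closed = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closed), repeat=2):
            if b == c and (a, d) not in closed:
                closed.add((a, d))
                changed = True
    return closed

def has_cycle(edges):
    """A graph is cyclic iff its transitive closure contains a self-loop."""
    return any(a == b for a, b in closure(edges))

g1 = closure(direct_conflicts("r1(a) w2(a)"))
g2 = closure(direct_conflicts("r3(b) w1(b) r2(c) w3(c)"))
global_graph = g1 | g2
print(sorted(g2), has_cycle(g1), has_cycle(g2), has_cycle(global_graph))
```

Each local graph is acyclic, but G(s2) contains the indirect edge T2 → T1, which together with T1 → T2 from G(s1) makes the union cyclic.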
33Global serializability - problems
- To ensure global serializability, the serialization order of global transactions must be the same at all sites at which they execute
- Serialization orders of local schedules must be validated by the HDDBS
- These orders are neither reported by the EDSs, nor can they be determined by controlling the submission of the global subtransactions or observing their execution order
34Example
- Globally non-serializable schedule
- s1 = w1(a) r2(a) (T1 → T2)
- s2 = w2(c) r3(c) w3(b) r1(b) (T2 → T3 → T1)
- Globally serializable schedule
- s1 = w1(a) r2(a) (T1 → T2)
- s2 = w2(c) r1(b)
- Globally non-serializable schedule
- s1 = w1(a) r2(a) (T1 → T2)
- s2 = w3(b) r1(b) w2(c) r3(c) (T2 → T3 → T1)
35Quasi serializability
- Rejecting global serializability as the correctness criterion
- The basic idea: we assume that no value dependencies exist among EDSs, so indirect conflicts can be ignored
- In order to preserve global database consistency, only global transactions need to be executed in a serializable way, with proper consideration of the effects of local transactions
36Quasi serializability
- Quasi-serial schedule
- A set of local schedules {s1, ..., sn} is quasi serial if each sk, 1 ≤ k ≤ n, is conflict serializable and there exists a total order < on the set T of global transactions such that Ti < Tj for Ti, Tj ∈ T, i ≠ j, implies that in each local schedule sk the Ti subtransaction occurs completely before the Tj subtransaction
- Quasi serializability
- A set of local schedules {s1, ..., sn} is quasi serializable if there exists a set {s1', ..., sn'} of quasi serial local schedules such that si is conflict equivalent to si' for 1 ≤ i ≤ n.
37Example (1)
- Given a federation of two servers
- D1 = {a, b}, D2 = {c, d, e}
- Given two global transactions T1 and T2 and two local transactions T3 and T4
- T1 = w(a) r(d), T2 = r(b) r(c) w(e)
- T3 = r(a) w(b), T4 = w(d) r(e)
- Assume the following local schedules
- s1 = w1(a) r3(a) w3(b) r2(b)
- s2 = r2(c) w4(d) r1(d) w2(e) r4(e)
38Example (2)
- The set {s1, s2} is quasi serializable, since it is conflict equivalent to the quasi serial set {s1, s2'}, where
- s2' = w4(d) r1(d) r2(c) w2(e) r4(e)
- The global schedule
- s = w1(a) r3(a) r2(c) w4(d) r1(d) c1 w3(b) c3 r2(b) w2(e) c2 r4(e) c4
- is quasi serializable; however, s is not globally serializable
- Since the quasi-serialization order is always compatible with the orderings of subtransactions in the various local schedules, quasi serializability is relatively easy to achieve for a GTM
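The total-order part of the quasi-serial definition can be checked by brute force on this example. The sketch below only tests whether some order of the global transactions has every subtransaction occurring completely before the next one in each local schedule; it assumes that each local schedule is already conflict serializable and does not test conflict equivalence. The function names are hypothetical.

```python
import re
from itertools import permutations

def positions(schedule, global_txns):
    """Map each global transaction to the positions of its operations."""
    pos = {}
    for k, token in enumerate(schedule.split()):
        txn = int(re.fullmatch(r"[rw](\d+)\(\w+\)", token).group(1))
        if txn in global_txns:
            pos.setdefault(txn, []).append(k)
    return pos

def respects(order, schedule, global_txns):
    """True if, in `schedule`, each subtransaction ends before the next one in `order` starts."""
    pos = positions(schedule, global_txns)
    present = [t for t in order if t in pos]
    return all(max(pos[a]) < min(pos[b]) for a, b in zip(present, present[1:]))

def quasi_serial_order(schedules, global_txns):
    """Return a witnessing total order of the global transactions, or None."""
    for order in permutations(sorted(global_txns)):
        if all(respects(order, s, global_txns) for s in schedules):
            return order
    return None

s1 = "w1(a) r3(a) w3(b) r2(b)"
s2 = "r2(c) w4(d) r1(d) w2(e) r4(e)"
s2_prime = "w4(d) r1(d) r2(c) w2(e) r4(e)"
print(quasi_serial_order([s1, s2_prime], {1, 2}))  # quasi serial set
print(quasi_serial_order([s1, s2], {1, 2}))        # original set
```

The set with s2' has the order T1 before T2; the original s2 interleaves the two global subtransactions, so it is quasi serializable only through its conflict-equivalent form s2'.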
39Achieving Global Serializability through Local
Guarantees - Rigorousness
- The GTM assumes that local schedules are conflict serializable
- There are various scenarios for guaranteeing global serializability
- Rigorousness: local schedulers produce conflict-serializable rigorous schedules. A schedule s is rigorous if it satisfies the following condition:
- if oi(x) <s oj(x), i ≠ j, and oi, oj are in conflict,
- then ai <s oj(x) or ci <s oj(x)
- Schedules in RG avoid any type of rw, wr, or ww conflict between uncommitted transactions
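The rigorousness condition can be checked with one pass over a schedule: whenever an operation conflicts with an earlier operation of another transaction, that earlier transaction must already have committed or aborted. This is a minimal sketch with a hypothetical function name, assuming tokens of the form `r1(x)`, `w1(x)`, `c1`, and `a1`.

```python
import re

def is_rigorous(schedule):
    """Check that no operation conflicts with an operation of an unfinished transaction."""
    ops = [
        re.fullmatch(r"([rwca])(\d+)(?:\((\w+)\))?", token).groups()
        for token in schedule.split()
    ]
    finished = set()   # transactions that have committed or aborted
    accessed = []      # data operations seen so far
    for act, txn, item in ops:
        if act in ("c", "a"):
            finished.add(txn)
            continue
        for prev_act, prev_txn, prev_item in accessed:
            in_conflict = (
                prev_item == item and prev_txn != txn and "w" in (act, prev_act)
            )
            if in_conflict and prev_txn not in finished:
                return False
        accessed.append((act, txn, item))
    return True

print(is_rigorous("w1(a) c1 r3(a) r3(b) c3 w2(b) c2"))  # rigorous
print(is_rigorous("r1(x) w2(x) c1 c2"))                 # rw conflict with uncommitted T1
```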
40Achieving Global Serializability through Local
Guarantees - Rigorousness
- Given a federation of two servers
- D1 = {a, b}, D2 = {c, d}
- Given two global transactions T1 and T2 and two local transactions T3 and T4
- T1 = w(a) w(d), T2 = w(c) w(b)
- T3 = r(a) r(b), T4 = r(c) r(d)
- Assume the following local schedules
- s1 = w1(a) c1 r3(a) r3(b) c3 w2(b) c2
- s2 = w2(c) c2 r4(c) r4(d) c4 w1(d) c1
- Both schedules are rigorous, but they yield different serialization orders (T1 → T3 → T2 in s1, T2 → T4 → T1 in s2)
41Achieving Global Serializability through Local
Guarantees - Rigorousness
- Commit-deferred transactions: a global transaction T is commit-deferred if its commit operation is sent by the GTM to the local sites only after the local executions of all data operations of T have been acknowledged at all sites
- Theorem: if si ∈ RG, 1 ≤ i ≤ n, and all global transactions are commit-deferred, then s is globally serializable
42Possible solutions
- Bottom-up approach: observing the execution of global transactions at each EDS.
- Idea: the execution order of global transactions is determined by their serialization orders at each EDS
- Problem: how to determine the serialization order of global transactions
- Top-down approach: controlling the submission and execution order of global transactions
- Idea: the GTM determines a global serialization order for global transactions before submitting them to the EDSs. It is the EDSs' responsibility to enforce the order at the local sites
- Problem: how the order is enforced at the local sites
43Ticket-Based Method
- How can the GTM obtain information about the relative order of subtransactions of global transactions at each EDS?
- How can the GTM guarantee that subtransactions of each global transaction have the same relative order in all participating EDSs?
- Idea: force local direct conflicts between global transactions, or convert indirect conflicts (not observable by the GTM) into direct (observable) conflicts
44Ticket-Based Method
- Ticket: a ticket is a logical timestamp whose value is stored as a special data item in each EDS
- Each subtransaction is required to issue the Take_A_Ticket operation:
- r(ticket) w(ticket+1) (critical section)
- Only subtransactions of global transactions have to take tickets
- Theorem: if global transaction T1 takes its ticket before global transaction T2 in a server, then T1 will be serialized before T2 by that server
- or: tickets obtained by subtransactions determine their relative serialization order
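The Take_A_Ticket operation is a read followed by a write of the incremented value, executed as a critical section. The sketch below models one EDS in memory; the class `ExternalDataSource` is hypothetical, and the `threading.Lock` merely stands in for whatever local concurrency control a real EDS applies to the ticket data item.

```python
import threading

class ExternalDataSource:
    """In-memory stand-in for an EDS that stores a ticket data item."""

    def __init__(self, name):
        self.name = name
        self._ticket = 0
        self._lock = threading.Lock()

    def take_a_ticket(self):
        """r(ticket) w(ticket+1), executed as a critical section."""
        with self._lock:
            value = self._ticket        # r(ticket)
            self._ticket = value + 1    # w(ticket+1)
        return value

eds = ExternalDataSource("EDS1")
ticket_t1 = eds.take_a_ticket()   # subtransaction of T1
ticket_t2 = eds.take_a_ticket()   # subtransaction of T2
print(ticket_t1, ticket_t2)
```

Because both subtransactions read and write the same data item, they are in direct conflict, and the smaller ticket value identifies the one that is serialized first at this EDS.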
45Example (1)
- Given a federation of two servers
- D1 = {a}, D2 = {b, c}
- Given two global transactions T1 and T2 and a local transaction T3
- T1 = r(a) w(b), T2 = w(a) r(c)
- T3 = r(b) w(c)
- Assume the following local schedules
- s1 = r1(a) c1 w2(a) c2 (T1 → T2)
- s2 = r3(b) w1(b) c1 r2(c) c2 w3(c) c3
- The schedule is not globally serializable: T2 → T3 → T1 in s2
46Example (2)
- Using tickets, the local schedules look as follows
- s1 = r1(I1) w1(I1+1) r1(a) c1 r2(I1) w2(I1+1) w2(a) c2
- s2 = r3(b) r1(I2) w1(I2+1) w1(b) c1 r2(I2) w2(I2+1) r2(c) c2 w3(c) c3
- The indirect conflict between global transactions in the schedule s2 has been turned into an explicit one; the schedule s2 is not conflict serializable
[Figure: conflict graph of s2 over T1, T2, and T3, which contains a cycle]
47Example (3)
- Consider another set of schedules
- s1 = r1(I1) w1(I1+1) r1(a) c1 r2(I1) w2(I1+1) w2(a) c2
- s2 = r3(b) r2(I2) w2(I2+1) r1(I2) w1(I2+1) w1(b) c1 r2(c) c2 w3(c) c3
- Now, both schedules are conflict serializable; tickets obtained by transactions determine their serialization order
48Optimistic ticket method
- Optimistic ticket method (OTM): the GTM must ensure that the subtransactions have the same relative serialization order in their corresponding EDSs
- The idea is to allow the subtransactions to proceed but to commit them only if their ticket values have the same relative order in all participating EDSs
- Requirement: EDSs must support a visible prepare_to_commit state for all subtransactions
- The prepare_to_commit state is visible if the application program can decide whether the transaction should commit or abort
49Optimistic ticket method
- A global transaction T proceeds as follows:
- The GTM sets a timeout for T
- It submits all subtransactions of T to their corresponding EDSs
- When they enter their prepare_to_commit state, they wait for the GTM to validate T
- Commit or abort is broadcast
- The GTM validates T using a ticket graph; the graph is tested for cycles involving T
- Problems with OTM
- Global aborts caused by ticket operations
- The probability of global deadlocks increases
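The validation step can be sketched as a cycle test on a graph built from ticket values. The data layout (a dictionary from site to the tickets taken there) and the function names are assumptions for illustration; the slides only state that a ticket graph is tested for cycles involving T.

```python
def ticket_graph(tickets_by_site):
    """Edges Ti -> Tj for transactions with consecutive ticket values at some site."""
    edges = set()
    for tickets in tickets_by_site.values():
        ordered = sorted(tickets, key=tickets.get)   # ascending ticket value
        edges.update(zip(ordered, ordered[1:]))
    return edges

def on_cycle(edges, txn):
    """True if `txn` can reach itself by following ticket-order edges."""
    stack = [b for a, b in edges if a == txn]
    seen = set()
    while stack:
        node = stack.pop()
        if node == txn:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(b for a, b in edges if a == node)
    return False

def validate(tickets_by_site, txn):
    """Decision the GTM would broadcast for `txn` in this sketch."""
    return "abort" if on_cycle(ticket_graph(tickets_by_site), txn) else "commit"

consistent = {"EDS1": {"T1": 0, "T2": 1}, "EDS2": {"T1": 0, "T2": 1}}
inconsistent = {"EDS1": {"T1": 0, "T2": 1}, "EDS2": {"T2": 0, "T1": 1}}
print(validate(consistent, "T2"), validate(inconsistent, "T2"))
```

The second input corresponds to the situation where T1 takes its ticket first at one EDS and T2 takes its ticket first at the other, so the relative orders disagree and the transaction being validated is aborted.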
50Cache Coherence and Concurrency Control for
Data-Sharing Systems
51Architectures for Parallel Distributed Database
Systems
- Three main architectures
- Shared memory systems
- Shared disk systems
- Shared nothing
- Shared memory system: multiple CPUs are attached to an interconnection network and can access a common region of main memory
- Shared disk system: each CPU has a private memory and direct access to all disks through an interconnection network
- Shared nothing system: each CPU has local memory and disk space, but no two CPUs can access the same storage area; all communication is through a network connection
52Shared memory system
[Figure: shared memory system. Processors (P), a global shared memory, and disks (D) attached to an interconnection network]
53Shared disk system
[Figure: shared disk system. Processors (P), each with a private memory (M), connected through an interconnection network to the disks (D)]
54Shared nothing system
[Figure: shared nothing system. Processors (P), each with its own memory (M) and disk (D), connected through an interconnection network]
55Characteristic of architectures
- Shared memory
- It is closer to a conventional machine; many commercial DBMSs have been ported to this platform
- Communication overhead is low
- Memory contention becomes a bottleneck as the number of CPUs increases
- Shared disk: similar characteristics
- Interference problem: as more CPUs are added, existing CPUs are slowed down because of the increased contention for memory access and network bandwidth
- A system with 1000 CPUs is only 4% as effective as a single-CPU system
56Shared nothing
- It provides almost linear speed-up, in that the time taken for operations decreases in proportion to the increase in the number of CPUs and disks
- It provides almost linear scale-up, in that performance is sustained if the number of CPUs and disks is increased in proportion to the amount of data
- Powerful parallel database systems can be built by taking advantage of the rapidly improving performance of single-CPU systems
57Shared nothing
[Figure: two plots of transactions/second. SPEED-UP: against the number of CPUs. SCALE-UP with DB SIZE: against the number of CPUs and DB size]
58Concurrency and cache coherency problem
- Data pages can be dynamically replicated in more than one server cache to exploit access locality
- Synchronization of reads and writes requires some form of distributed lock management, and invalidation of stale copies of data items or propagation of updated data items must be communicated among the servers
- Basic assumption for data-sharing systems: each individual transaction is executed solely on one server (i.e., a transaction does not migrate among servers during its execution)
59Callback Locking
- We assume that both concurrency control and cache coherency control are page oriented
- Each server has a global lock manager and a local lock manager
- Data items are assigned to global lock managers in a static manner (e.g., via hashing), so each global lock manager is responsible for a fixed subset of the data items; we say that the global lock manager has the global lock authority for a data item
- The global lock manager knows for a data item at each point in time whether the item is locked or not
60Callback Locking - concurrency control
- When a transaction requests a lock or wants to release a lock, it first addresses its local lock manager, which can then contact the global lock manager
- The simplest way is to forward all lock and unlock requests to the global lock manager that has the global lock authority for the given data item
- If a local lock manager is authorized to manage read locks (or write locks) locally, then it can save message exchanges with the global lock manager
61Callback Locking concurrency control
- Local read authority enables the local lock manager to grant local read locks for a data item
- Local write authority enables the local lock manager to grant local read/write locks for a data item
- A write authority has to be returned to the corresponding global lock manager if another server wants to access the data item
- A read authority can be held by several servers simultaneously and has to be returned to the corresponding global lock manager if another server wants to perform a write access to the data item
62Callback Locking concurrency control
- The cache coherency protocol needs to ensure that
- multiple caches can hold up-to-date versions of a page simultaneously as long as the page is only read, and
- once a page has been modified in one of the caches, this cache is the only one that is allowed to hold a copy of the page
- A callback message revokes the local lock authority
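The authority bookkeeping at the home of a page can be sketched as follows. This is a simplified model under stated assumptions: the class `HomeLockManager` and its methods are hypothetical, server names are arbitrary strings, and the sketch only computes which servers must receive a callback. In a real system the home would wait for each of them to answer before granting the new authority, since a server can return an authority only after its local transactions have released their locks.

```python
class HomeLockManager:
    """Global lock manager for the pages it owns; tracks who holds which authority."""

    def __init__(self):
        self.readers = {}   # page -> set of servers holding read authority
        self.writer = {}    # page -> server holding write authority

    def request_read(self, server, page):
        """Grant read authority; return the servers that must be called back."""
        callbacks = []
        holder = self.writer.get(page)
        if holder == server:
            return callbacks                  # write authority already covers reads
        if holder is not None:
            callbacks.append(holder)          # revoke the other server's write authority
            del self.writer[page]
        self.readers.setdefault(page, set()).add(server)
        return callbacks

    def request_write(self, server, page):
        """Grant write authority; return the servers that must be called back."""
        others = self.readers.pop(page, set()) - {server}
        holder = self.writer.get(page)
        if holder not in (None, server):
            others.add(holder)
        self.writer[page] = server
        return sorted(others)

home = HomeLockManager()
print(home.request_read("A", "x"))    # no callbacks needed
print(home.request_read("B", "x"))    # read authority is shared
print(home.request_write("C", "x"))   # callbacks to both readers
```

After the write request, server C is the only holder of an authority for x, which mirrors the coherency rule that a modified page may be cached at one server only.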
63Callback Locking
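[Figure: callback locking message sequence among Server A, Server B, Server C, and Home(x). For r1(x) and for r2(x), an Rlock(x) request is sent to Home(x) and answered with Rlock authority(x); c1, r3(x), and c3 follow; then w4(x) is issued]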
64Callback Locking
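[Figure: callback locking message sequence, continued, among Server A, Server B, Server C, and Home(x). After c1, r3(x), c3, the write w4(x) sends Wlock(x) to Home(x); Home(x) sends Callback(x) to the two servers holding read authority; one answers OK, the other answers OK after c2; Home(x) then grants Wlock authority(x)]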