Title: Distributed Databases (DDBs)
1 Distributed Databases (DDBs)
2 Distributed Databases
- Distributed Systems goal: to offer local DB autonomy at geographically distributed locations
- Multiple CPUs, each with its own DBMS, but data is distributed
3 Advantages of DDBs
- Distributed nature of some DB applications (e.g., bank branches)
- Increased reliability and availability under site failure - also replicate data at > 1 site
- Data sharing, but also local control
- Improved performance - smaller DBs exist at each site
4 Disadvantages: Increased complexity
- Additional functions needed
- global vs. local queries
- access remote sites, transmit queries and data
- keep track of data and replication
- execution strategies if data is at > 1 site
- which copy to access
- maintain consistency of copies
- recover from site crashes
5 Architectures
- Parallel Systems goal: to construct a faster centralized computer
- Distributed Systems goal: to offer local DB autonomy at geographically distributed locations
6 Parallel DBSs
- Shared-memory multiprocessor
- get N times as much work done with N CPUs
- MIMD, SIMD - equal access to the same data, massively parallel
- Parallel shared-nothing
- data split among CPUs; each has its own CPU and memory; divide work for transactions; communicate over high-speed networks (LANs)
- homogeneous machines; each CPU + memory unit is called a site
7 Distributed DBSs (DDBSs)
- Distributed DB - shared nothing
- lower communication rates
- WAN
- heterogeneous machines
- Homogeneous DDBS
- same DBMS at every site
- Heterogeneous DDBS
- different DBMSs - need ODBC, standard SQL
8 Heterogeneous distributed DBSs (HDDBs)
- Data is distributed and each site has its own DBMS
- ORACLE at one site, DB2 at another, etc.
- need ODBC, standard SQL
- usually a transaction manager is responsible for cooperation among sites
- must coordinate distributed transactions
- need data conversion and access to data at other sites
9 Multidatabase
- autonomous preexisting DBs form a new database
- a collection of cooperating DBSs that are heterogeneous
- Each DB has its own local users, local transparency, and DBA - a very high degree of local autonomy
- need an additional interface to allow users to access global data
10 Federated DB
- a federated DB is a multidatabase that bridges heterogeneity - keeps a partial view of the total schema
- Each DB specifies an import/export schema (view)
- appears centralized to local autonomous users
- appears distributed to global users
11 DDBS
- Issues in DDBSs in the slides that follow
12 To process a query
- Must use a data dictionary that includes info on data distribution among servers
- Parse the user query
- decompose it into independent site queries
- each site query is sent to the appropriate server site
- each site processes its local query and sends the result to the result site
- the result site combines the results of the subqueries
- Ensure atomicity
13 Data fragments
- Can store a whole relation at a site
- or
- Data fragments
- logical units of the DB assigned for storage at various sites
- horizontal fragmentation - a subset of the tuples in the relation (select)
- vertical fragmentation - keeps only certain attributes of the relation (project); needs a PK
14 Fragments (contd.)
- horizontal fragments
- complete - a set of fragments whose conditions include every tuple
- disjoint - each tuple is a member of only 1 fragment, e.g., salary < 5000 AND dno = 4
- Complete vertical fragmentation: L1 ∪ L2 ∪ ... ∪ Ln = attributes of R, with Li ∩ Lj = PK(R) for i ≠ j
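The completeness and disjointness conditions above can be checked mechanically. A minimal sketch, with hypothetical EMP tuples and predicates (not the slides' company DB):

```python
# Sketch: checking completeness and disjointness of a horizontal
# fragmentation. Tuples and predicates are illustrative only.

emps = [
    {"name": "a", "salary": 3000, "dno": 4},
    {"name": "b", "salary": 7000, "dno": 4},
    {"name": "c", "salary": 2000, "dno": 5},
]

# One selection predicate per fragment (horizontal fragmentation = select).
predicates = [
    lambda t: t["salary"] < 5000 and t["dno"] == 4,   # fragment 1
    lambda t: t["salary"] >= 5000 and t["dno"] == 4,  # fragment 2
    lambda t: t["dno"] != 4,                          # fragment 3
]

fragments = [[t for t in emps if p(t)] for p in predicates]

# Complete: every tuple satisfies at least one predicate.
complete = all(any(p(t) for p in predicates) for t in emps)

# Disjoint: no tuple satisfies more than one predicate.
disjoint = all(sum(p(t) for p in predicates) <= 1 for t in emps)

print(complete, disjoint)
```

Each tuple lands in exactly one fragment, so both conditions hold for these predicates.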
15 Replication
- Full vs. partial replication
- Which copy to access?
- Improves performance for global queries, but updates are a problem
- Must ensure consistency of replicated copies of data
16 Example replication/fragmentation
- Example of fragments for a company DB
- site 1 - company headquarters - gets the entire DB
- sites 2, 3 - horizontal fragments based on dept. no.
17 Concurrency Control and Recovery in Distributed Databases
- Distributed transactions inherit ACID properties
- problems
- failures at individual sites and communication links
- distributed deadlock (usually handled by timeout)
- distributed commit (use 2PC)
- concurrency control with multiple copies of data
18 Distributed Commit
- if updating data at different sites for the same transaction, when to commit?
- difficult part - commit together so we have atomicity
- Observation - 3 states of basic transactions
- active - can enter either of the other 2 states
- committed - cannot leave this state
- aborted - cannot leave this state
19 States
20 Commit problems
- To commit a distributed transaction, need a coordinator (originator) site (S1 in our example)
- Given transaction T with components T1 and T2
- where T1 executes at site S1
- T2 executes at site S2
- a) if S1 commits locally, then sends a commit message to S2
- BUT if S2 crashes, T2 aborts before completing its subtask - not atomic
- b) if both T1 and T2 are ready to commit, S1 sends a message to commit
- BUT if T2 crashes during commit, not atomic if it can't recover
21 To solve the problem
- How to solve these problems?
- Need an additional state
- prepare state
- when T is done and ready to commit, it goes to the prepare state
- in the prepare state, a transaction can go to either the commit or the abort state
- (place a prepare record in the log buffer, forced to the log file - can reconstruct the transaction)
22 States
- (state diagram: Active → Prepare to commit → commit or abort)
23 Use Two-Phase Commit (2PC)
- Used by commercial DBSs to achieve coordinated commit of distributed transactions
- Need a coordinator - usually a site that is participating in the transaction
- Only needed if the transaction performs updates
24 2PC
- Phase 1
- When each subquery concludes, it signals the coordinator
- If the coordinator receives done from all sites, it sends prepare to commit to all sites
- Else the coordinator signals abort to all
- Each site writes its changes and a Prepare record to the log file, sends ready to commit, and enters the prepared state
- if the log write fails or the site cannot commit for other reasons, it sends cannot commit; else can commit
- If the coordinator doesn't hear from everyone within a timeout interval, assume not OK, send abort to all - roll back locally
25 2PC
- Phase 2
- If all prepare requests receive successful responses, the coordinator sends commit
- Sites record Commit in the log and update the DB
- If any site's response is unsuccessful, the coordinator sends messages to all sites to abort - roll back locally
- If the coordinator doesn't hear from everyone within a timeout interval, assume not OK, send abort to all - roll back locally
- (message flow: subquery concluded → ready to commit (prepared state) → commit → record commit)
26 Solved by 2PC?
- a) if S1 commits locally, then sends commit to S2
- BUT if S2 crashes, T2 aborts and it is not atomic
- With 2PC
- if T1 completes the transaction
- T1 is prepared
- If T2 crashes
- T2 is unsuccessful
- T1 can still abort from the prepared state
27 Solved by 2PC?
- b) If both T1 and T2 are ready to commit, S1 sends a message to commit
- BUT if T2 crashes before the commit completes, not atomic
- With 2PC
- if both T1 and T2 successfully complete the transaction
- T1 and T2 are both prepared
- If T2 crashes, it can recover its prepared state and commit
28 Any problems with 2PC?
- A blocking protocol (block until commit or rollback is received)
- In most practical applications, blocking caused by 2PC is rare
- Therefore, most systems use 2PC
- However, when could 2PC block?
29 2PC blocking
- However, when could 2PC block?
- The problem can occur when a site is in the prepared state
- The site voted to commit
- It cannot commit until it receives commit from the coordinator
- It cannot change its vote and abort
- The participant is blocked until it receives a message from the coordinator
- (or the coordinator can block while waiting for replies from participants - what about a timeout?)
- Solution?
30 Three-phase commit
- 3PC
- Requires an extra round of messages
- Adds another state, pre-commit, in between the ready-to-commit (prepared) state and the commit state
- When in the pre-commit state
- can still abort
- can contact other sites
31 3PC
- Phase 1 is the same as in 2PC
- Coordinator sends vote_req to all sites
- Sites respond commit or abort
- If one aborts, all abort
- Phase 2
- If all reply vote_commit, the coordinator sends precommit (prepare_to_commit)
- Sites that responded commit wait
- If a site receives precommit, it sends ack
- Else abort
32 3PC
- Phase 3
- Coordinator collects all acks
- Decides commit
- Broadcasts global_commit
- A site receiving global_commit commits
- (message flow: subquery concluded → ready to commit → precommit → commit → record commit)
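The three rounds can be sketched as decision steps, assuming message delivery is modeled by dicts of replies (absent = timeout). For simplicity this sketch treats a missing ack as abort, although real 3PC can still decide commit in some of those failure cases, as the next slides discuss:

```python
# Sketch of the 3PC rounds as pure decision steps (no real messaging).

def three_phase_commit(vote_replies, ack_replies, participants):
    # Phase 1: vote_req -> sites reply 'commit' or 'abort'.
    if any(vote_replies.get(s) != "commit" for s in participants):
        return "abort"            # one abort (or timeout) aborts all
    # Phase 2: coordinator sends precommit; sites reply ack.
    if any(not ack_replies.get(s) for s in participants):
        return "abort"            # simplification: missing ack -> abort
    # Phase 3: all acks collected -> broadcast global_commit.
    return "global_commit"

votes = {"S1": "commit", "S2": "commit"}
print(three_phase_commit(votes, {"S1": True, "S2": True}, ["S1", "S2"]))  # global_commit
print(three_phase_commit(votes, {"S1": True}, ["S1", "S2"]))              # abort
```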
33 3PC
- Eliminates blocking due to coordinator failure when
- a site sent vote_commit but has not received prepare_to_commit
- the site can abort if it does not hear from the coordinator
- a site received prepare_to_commit but not global_commit
- it knows everyone voted to commit and that the transaction will eventually commit, so it can act independently and communicate with other sites
34 3PC
- If in the pre-commit state
- sites can elect a new coordinator if the current one fails
- the new coordinator collects states from the sites and tries to resolve the transaction
- Otherwise, the new coordinator tries to establish a quorum
35 Quorum-based 3PC: establishing a quorum
- Every site is assigned a vote Vi
- Total votes V
- The abort and commit quorums are Va and Vc
- Va + Vc > V
- Before commit, must obtain a commit quorum
- At least one site in pre-commit, and (sites in wait + sites in pre-commit) form the quorum
- Before abort, must obtain an abort quorum
- (Sites in wait + pre-abort) form the quorum
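The two quorum tests can be sketched directly from the definitions above. Site names, states, and equal per-site votes are illustrative; the condition Va + Vc > V keeps the two quorums from being independently satisfiable:

```python
# Sketch of the commit/abort quorum tests in quorum-based 3PC.

def can_commit(states, votes, Vc):
    """Commit quorum: at least 1 site in pre_commit, and the votes of
    sites in wait or pre_commit reach Vc."""
    in_pre = [s for s, st in states.items() if st == "pre_commit"]
    eligible = sum(v for s, v in votes.items()
                   if states.get(s) in ("wait", "pre_commit"))
    return len(in_pre) >= 1 and eligible >= Vc

def can_abort(states, votes, Va):
    """Abort quorum: votes of sites in wait or pre_abort reach Va."""
    eligible = sum(v for s, v in votes.items()
                   if states.get(s) in ("wait", "pre_abort"))
    return eligible >= Va

votes = {"S1": 1, "S2": 1, "S3": 1}   # V = 3
Va, Vc = 3, 2                          # Va + Vc = 5 > 3
states = {"S1": "pre_commit", "S2": "wait", "S3": "pre_abort"}
print(can_commit(states, votes, Vc))   # True: S1 pre_commit, S1+S2 give 2 votes
print(can_abort(states, votes, Va))    # False: S2+S3 give only 2 < 3 votes
```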
36 Distributed concurrency control using locking
- Centralized vs. distributed solutions
- Definition: distinguished copy
- Locks are associated with this copy
37 Primary site technique
- Primary site technique
- The distinguished copy of all data is at this site
- All locks are kept at this site; all requests are sent here
- The site acts as coordinator for all DB items
- For a write lock, the DBMS must update all copies
- For a read lock held from the primary site, can access the local copy
- disadvantages
- all locking requests are sent to the same site - bottleneck
- on failure at the primary site, must abort and restart all transactions
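A minimal sketch of the primary-site idea: one site holds the lock table for every item, and all requests funnel through it (the bottleneck noted above). Lock modes and item names are illustrative:

```python
# Sketch of a primary-site lock manager: shared read locks, exclusive
# write locks, all kept at the single distinguished site.

class PrimarySiteLockManager:
    def __init__(self):
        self.locks = {}  # item -> (mode, holder count)

    def request(self, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, 1)
            return True
        hmode, count = held
        if mode == "read" and hmode == "read":
            self.locks[item] = ("read", count + 1)  # shared read lock
            return True
        return False  # conflict: caller must wait or abort

    def release(self, item):
        mode, count = self.locks[item]
        if count > 1:
            self.locks[item] = (mode, count - 1)
        else:
            del self.locks[item]

primary = PrimarySiteLockManager()
print(primary.request("emp.salary", "read"))   # True
print(primary.request("emp.salary", "read"))   # True (shared)
print(primary.request("emp.salary", "write"))  # False (conflicts with readers)
```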
38 Primary site with backup
- Primary site with backup
- all lock information is kept at both sites, so failure is less of a problem
- disadvantage
- slower - 2 sites to communicate with
39 Primary copy technique
- Primary copy technique - distributes responsibility
- distinguished copies of different items are at different sites
- can also choose a new coordinator after a failure - an election in which a site chooses itself as the new coordinator
40 Voting
- Voting - no distinguished copy
- the lock request is sent to all sites with a copy of the data
- if granted the lock by a majority of the copies, hold the lock and inform the sites with copies that the lock is granted
- if a majority is not obtained, time out and send a cancel request to all sites
- disadvantage
- lots of message traffic
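The majority rule above is a one-liner; a sketch, where a False grant also covers a timed-out site:

```python
# Sketch of majority voting for a lock: succeed only if more than
# half of the copy-holding sites grant the request.

def majority_lock(grants):
    """grants: dict site -> True/False (False also models a timeout)."""
    yes = sum(1 for g in grants.values() if g)
    if yes > len(grants) / 2:
        return "locked"   # inform all copy sites that the lock is held
    return "cancel"       # no majority: time out, cancel at all sites

print(majority_lock({"S1": True, "S2": True, "S3": False}))   # locked
print(majority_lock({"S1": True, "S2": False, "S3": False}))  # cancel
```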
41 Additional Issues in DDBSs
- Distributed transactions if data is distributed
- The transaction manager (TM) knows where data is stored
- The TM decides which site starts the local components of a transaction
- Distributed query processing
42 Query Parallelism
- Decompose a query into parts that can be executed in parallel at several sites
- Intra-query parallelism
- If shared nothing and horizontally fragmented
- Select name, phone from account where age > 65
- Decompose into K different queries
- The result site accepts all partial results and puts them together (order by, count)
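The decompose-and-combine step can be sketched over horizontal fragments: each site evaluates the same selection on its own fragment, and the result site unions the partial results. The fragment contents are made up for illustration:

```python
# Sketch of intra-query parallelism for
#   SELECT name, phone FROM account WHERE age > 65
# over horizontally fragmented data.

fragments = {  # site -> its horizontal fragment of account
    "S1": [{"name": "ann", "phone": "555-1", "age": 70},
           {"name": "bob", "phone": "555-2", "age": 40}],
    "S2": [{"name": "cy",  "phone": "555-3", "age": 81}],
}

def site_query(fragment):
    # The selection and projection, run locally at each site.
    return [{"name": t["name"], "phone": t["phone"]}
            for t in fragment if t["age"] > 65]

# Result site: run each subquery (in parallel in a real DDBS), combine.
partials = [site_query(f) for f in fragments.values()]
result = [row for part in partials for row in part]
print(result)  # ann and cy qualify
```

An ORDER BY or COUNT would then be applied to `result` at the result site, since no single fragment can compute it alone.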
43 Query Parallelism
- What about a join?
- A difficult problem if the table fragments are at different sites
- Must get all values of the join attributes at one site
- Then broadcast the values of the join attribute to the relevant sites
- If site 1 has values 1-10 and site 2 has 11-20, only send to those sites
- Result tuples are returned and the join is performed
- Example: A ⋈ B with matching tuples (A1,B1), (A2,B2), (A3,B3); tuples A1, A2, A3 are sent to S1; S1 sends the FK values to the sites holding B1, B2, B3; those sites send back the tuples whose PK is in {A1, A2, A3}; the join is performed at S1
44 Distributed Query Processing
- If the query is read-only - no 2PC needed
- If horizontally fragmented
- decompose the query into subqueries that can be processed in parallel
- one site (the result site) accepts all results and puts them together
- Must also handle order by, count
45 Distributed Query Processing
- If the data is not fragmented
- process whatever operations you can locally (select, project)
- A query can involve joins between data at multiple sites
- Must transfer data over the network
- Transfer of data is the highest cost
- Algorithms to minimize the amount of data transferred (NP-hard)
46 Distributed Query Processing
- Assume R1 ⋈ R2 → Site 3
- with R1 at S1 and R2 at S2
- Can do
- R1 → S3 and R2 → S3, do the join at S3
- R1 → S2, execute the join at S2, send the result to S3
- R2 → S1, execute the join at S1, send the result to S3
47 Semi-join ⋉
- π dname, lname (Dept ⋈ mgrssn=ssn Emp)
- Assume Dept is R1 and Emp is R2
- 1) project the join attribute of Emp at S2, resulting in E'; send it to S1. E' is (π ssn Emp)
- 2) E' ⋈ Dept at S1. Project the result attributes and join attributes, resulting in E''; send it to S2
- E'' is (π mgrssn, dname (Dept ⋈ mgrssn=ssn E'))
- 3) π dname, lname (E'' ⋈ mgrssn=ssn Emp), and send the result to S3
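The 3 steps above can be sketched with toy Dept/Emp tables (the attribute values are made up); join condition Dept.mgrssn = Emp.ssn:

```python
# Sketch of the semi-join plan: only projections and matching tuples
# cross the network, never the full Emp relation.

dept = [{"dname": "HQ", "mgrssn": 1}, {"dname": "R&D", "mgrssn": 3}]  # at S1
emp = [{"ssn": 1, "lname": "smith"}, {"ssn": 2, "lname": "wong"},
       {"ssn": 3, "lname": "liu"}]                                     # at S2

# 1) At S2: E' = project ssn of Emp; send the small E' to S1.
e1 = {t["ssn"] for t in emp}

# 2) At S1: join E' with Dept; keep join + result attributes -> E''; send to S2.
e2 = [{"mgrssn": d["mgrssn"], "dname": d["dname"]}
      for d in dept if d["mgrssn"] in e1]

# 3) At S2: join E'' with Emp, project dname, lname; send result to S3.
result = [{"dname": d["dname"], "lname": e["lname"]}
          for d in e2 for e in emp if d["mgrssn"] == e["ssn"]]
print(result)
```

Only E' and E'' travel between sites, which is the whole point when Emp is large but few of its tuples join with Dept.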
48 Distributed Query Processing
- Hevner & Yao
- Cost
- time to compute results
- time to transmit data
- Local data requires only processing time
- Assume transmission time is the most costly
- Goal is to minimize transmission time
49 Distributed Query Processing
- Assume you have R3 = R1 ⋈ att=att R2
- R1 is at S1, R2 at S2, and the result relation R3 is sent to result site S
- Amount of data to transmit
- Assume R1 is 1M bytes
- R2 is 250K bytes
- R3 is 50K bytes
- Using semi-join
- π att R1 is 100K
- π att R2 is 50K
- (π att R1) ⋈ R2 is 10K
- (π att R2) ⋈ R1 is 25K
50 Distributed Query Processing
- Show all possibilities for the previous problem and determine the best distributed query processing strategy
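Under one reading of the sizes on the previous slide (taking the 10K and 25K figures as R2 and R1 after semi-join reduction - an assumption, since the slide's notation is ambiguous), the transmission costs tally as:

```python
# Worked tally of candidate strategies; all costs in KB transmitted.
# Assumes the 10K/25K figures are the semi-join-reduced fragment sizes.

R1, R2, R3 = 1000, 250, 50          # relation sizes (1M = 1000K)
proj_R1, proj_R2 = 100, 50          # pi_att R1, pi_att R2
red_R2, red_R1 = 10, 25             # R2 reduced by R1's atts, R1 reduced by R2's

costs = {
    "ship R1 to S2, result to S": R1 + R3,                 # join at S2
    "ship R2 to S1, result to S": R2 + R3,                 # join at S1
    "semi-join reducing R2":      proj_R1 + red_R2 + R3,   # pi to S2, back, result
    "semi-join reducing R1":      proj_R2 + red_R1 + R3,   # pi to S1, back, result
}
best = min(costs, key=costs.get)
print(best, costs[best])  # the cheaper semi-join wins
```

On these numbers the semi-join that reduces R1 (50 + 25 + 50 = 125K) beats shipping R2 whole (300K) and shipping R1 whole (1050K), illustrating why semi-joins minimize transmission.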
51 FDBSs - Data Integration
- Fully-integrated, logical composite of all constituent databases
- Potential problems
- Incompatible data types or query syntax
- Semantically equivalent but differently named parts of schemas
- Use views for data integration