CS603 Data Replication: Advanced - PowerPoint PPT Presentation

Slides: 16
Provided by: clif8

1
CS603 Data Replication: Advanced
  • March 1, 2002

2
Data Replication: What Haven't We Covered?
  • Transparent replication possible
  • Maintain serializability, ACID properties
  • Implemented in real systems
  • But poses performance problems
  • Delays on failure
  • Doesn't support mobility/disconnected operation
  • What if we don't need transparent replication?
  • Transaction serializability not required
  • COTS DBMSs provide alternatives
  • But how do we make sense of them?

3
Problem: Formalisms for Relaxed Consistency
  • Goal: Relaxed consistency constraints
  • Meet application needs
  • Outperform true transparent replication
  • How do we ensure constraints meet needs?
  • Formalisms to describe application needs
  • Methods to prove constraints adequate

4
Quasi-Copies (Alonso, Barbará, Garcia-Molina '90)
  • Data Caching
  • Each site keeps copy of data likely to be used
    locally
  • Propagation cost of writes high
  • User-Defined Cache
  • Controlled Divergence
  • Weak consistency constraints
  • Bounds on the differences between copies
  • User defines constraints

5
Tradeoffs Involved
  • Cost of transmission
  • Delay
  • Failure
  • Vs.
  • Cost of complexity
  • Defining semantics
  • Getting it wrong
  • Cost of consistency checks
  • Processing to see if copy meets constraints

6
Assumptions
  • Read-only copies
  • Updates sent to master copy
  • E.g., ORACLE Materialized View
  • User Specified Coherency
  • Strict limits
  • Hints
  • Example: Stock Purchase
  • Place order based on delayed price
  • Limit order to ensure price paid okay

7
Quasi-Caching Definition
  • Central node C
  • Set of objects O
  • Object $x \in O$ has values (attributes)
  • Changes result in versions x(t), latest is v(x)
  • Set of nodes N (C ? N)
  • $x'$: Quasi-copy of $x \in O$
  • $x'_j$: copy at $j \in N$
  • Conditions
  • Selection
  • Coherency
  • Access to $x$ on $j$ translates to $x'_j$ if it exists

8
Selection Conditions
  • Identification clause
  • Select/Project Query
  • SELECT NAME, PRICE
  • FROM STOCKS
  • WHERE TYPE = 'Chemical Company'
  • Modifier Clause
  • Add / drop from cache
  • Compulsory or advisory cache
  • Static / Dynamic: As new objects meet the
    identification clause, are they cached?
  • Triggering delay on dynamic
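
The selection condition above can be sketched in code. This is a minimal illustration, assuming the STOCKS table is a list of dictionaries held in memory; the class name `SelectionCondition`, its fields, and the `refresh` method are hypothetical and do not come from the slides. The triggering delay on dynamic selection is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

Row = Dict[str, object]


@dataclass
class SelectionCondition:
    predicate: Callable[[Row], bool]  # identification clause (the WHERE part)
    columns: List[str]                # projected attributes (the SELECT part)
    compulsory: bool = True           # modifier: compulsory vs. advisory cache
    dynamic: bool = True              # modifier: static vs. dynamic selection
    cached_keys: Optional[Set[str]] = None  # None until the first evaluation

    def refresh(self, table: List[Row], key: str = "NAME") -> Dict[str, Row]:
        """Return the rows that belong in the cache for the current table."""
        matches = {
            row[key]: {c: row[c] for c in self.columns}
            for row in table
            if self.predicate(row)
        }
        if not self.dynamic and self.cached_keys is not None:
            # Static: objects that start matching later are not added.
            matches = {k: v for k, v in matches.items() if k in self.cached_keys}
        self.cached_keys = set(matches)
        return matches


def is_chemical(row: Row) -> bool:
    return row["TYPE"] == "Chemical Company"


stocks = [
    {"NAME": "Dow", "PRICE": 50.0, "TYPE": "Chemical Company"},
    {"NAME": "IBM", "PRICE": 100.0, "TYPE": "Computer Company"},
]
static_sel = SelectionCondition(is_chemical, ["NAME", "PRICE"], dynamic=False)
dynamic_sel = SelectionCondition(is_chemical, ["NAME", "PRICE"], dynamic=True)
first_static = static_sel.refresh(stocks)
first_dynamic = dynamic_sel.refresh(stocks)

# A new object that meets the identification clause appears at the central site.
stocks.append({"NAME": "DuPont", "PRICE": 45.0, "TYPE": "Chemical Company"})
later_static = static_sel.refresh(stocks)
later_dynamic = dynamic_sel.refresh(stocks)
```

With static selection the cache keeps only the objects chosen at the first evaluation; with dynamic selection the newly matching object is picked up on the next refresh.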

9
Coherency Conditions
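  • Default (always enforced): Value was true once
  • $\forall t \ge 0\ \exists t_0$ s.t. $0 \le t_0 \le t$ and $x'(t) = x(t_0)$
  • Delay $W(x, \alpha)$: Max time lag
  • $\forall t \ge 0\ \exists k$ s.t. $0 \le k \le \alpha$ and $x'(t) = x(t - k)$
  • Version $V(x)$: Number of updates
  • $\forall t \ge 0\ \exists k, t_0$ s.t. $0 \le k \le \beta$, $0 \le t_0 \le t$,
    $v(x(t)) = v(x(t_0)) + k$, $x'(t) = x(t_0)$
  • Periodic $P(x)$: Time for refresh
  • $\forall t \ge 0\ \exists n$ s.t. $n \ge 0$, $\alpha + n\beta \le t < \alpha + (n+1)\beta$,
    $x'(t) = x(\alpha + n\beta)$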
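
The coherency conditions on this slide can be checked mechanically. The sketch below assumes the master object's history is available as a time-sorted list of (update time, value) pairs, and that a quasi-copy is identified by the index of the master version it holds rather than by its value (values may repeat). The function names are hypothetical; `alpha` and `beta` stand for the delay, version, and period bounds.

```python
import math
from bisect import bisect_right
from typing import List, Tuple

History = List[Tuple[float, float]]  # (update time, value), sorted by time


def version_at(history: History, t: float) -> int:
    """Index of the master version that is current at time t."""
    times = [ts for ts, _ in history]
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no version exists at time t")
    return i


def default_ok(history: History, copy_version: int, t: float) -> bool:
    """Default condition: the copy was a true master value at some t0 <= t."""
    return 0 <= copy_version < len(history) and history[copy_version][0] <= t


def delay_ok(history: History, copy_version: int, t: float, alpha: float) -> bool:
    """Delay: the copy equals the master value at t - k for some 0 <= k <= alpha."""
    if not default_ok(history, copy_version, t):
        return False
    if copy_version == len(history) - 1:
        return True  # never superseded, so it is still current at t
    superseded_at = history[copy_version + 1][0]
    return superseded_at > t - alpha


def version_ok(history: History, copy_version: int, t: float, beta: int) -> bool:
    """Version: the copy lags the current master by at most beta updates."""
    if not default_ok(history, copy_version, t):
        return False
    return version_at(history, t) - copy_version <= beta


def periodic_ok(history: History, copy_version: int, t: float,
                alpha: float, beta: float) -> bool:
    """Periodic: the copy equals the master value at the latest refresh point."""
    if t < alpha:
        return False  # no n >= 0 satisfies the condition before the first refresh
    n = math.floor((t - alpha) / beta)
    return copy_version == version_at(history, alpha + n * beta)


# Master updated at times 0, 5 and 8.
history = [(0, 10.0), (5, 11.0), (8, 12.0)]
```

For example, at time 9 a copy holding version 1 (superseded at time 8) meets a delay bound of 2 but not a delay bound of 0.5, and meets a version bound of 1.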

10
Coherency Constraints (cont.)
  • Arithmetic $A(x)$: Bounded Difference
  • $\forall t \ge 0,\ |x'(t) - x(t)| < \varepsilon$
  • $\forall t \ge 0,\ |(x'(t) - x(t)) / x(t)| < \varepsilon$
  • Combine conditions with logical operators
  • Multi-object conditions
  • Consistency conditions on a group
  • Order of application in a group
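
As a rough sketch of the arithmetic conditions and of combining conditions with logical operators, the helpers below treat each condition as a predicate over a (copy value, master value) pair. The names `absolute_bound`, `relative_bound`, `all_of`, and `any_of` are hypothetical, and multi-object (group) conditions are not modeled.

```python
from typing import Callable

# A condition maps (copy value, master value) to True when the copy is acceptable.
Condition = Callable[[float, float], bool]


def absolute_bound(eps: float) -> Condition:
    """|x'(t) - x(t)| < eps"""
    return lambda copy, master: abs(copy - master) < eps


def relative_bound(eps: float) -> Condition:
    """|(x'(t) - x(t)) / x(t)| < eps; assumes the master value is nonzero."""
    return lambda copy, master: abs((copy - master) / master) < eps


def all_of(*conds: Condition) -> Condition:
    """Logical AND of several conditions."""
    return lambda copy, master: all(c(copy, master) for c in conds)


def any_of(*conds: Condition) -> Condition:
    """Logical OR of several conditions."""
    return lambda copy, master: any(c(copy, master) for c in conds)


within_2_units = absolute_bound(2.0)
within_5_percent = relative_bound(0.05)
strict = all_of(within_2_units, within_5_percent)
lenient = any_of(within_2_units, within_5_percent)
```

With a cached price of 100 and a master price of 104, the copy is within 5 percent but not within 2 units, so `lenient` accepts it and `strict` rejects it.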

11
Implementation
  • Transmission Delays and Failures
  • $C(x'_j)$ is really $C(x'_j) \lor W(x'_j, \delta) \lor C_{Failed}(j)$
  • What to Propagate from Server
  • Data: New values
  • Invalidation Message
  • Version number message (without data)
  • Implicit Invalidation (time expired message)
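
The four propagation options can be pictured as message types handled at a caching node. This is only a sketch under stated assumptions: the class names, the `max_lag` parameter (a version bound the node is willing to tolerate), and the `expires_at` field used for implicit invalidation are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class NewValue:            # data: the server ships the new value itself
    value: float
    version: int


@dataclass
class Invalidate:          # explicit invalidation, no data
    pass


@dataclass
class VersionNotice:       # version number only; the node decides what to do
    version: int


Message = Union[NewValue, Invalidate, VersionNotice]


@dataclass
class CacheEntry:
    value: float
    version: int
    valid: bool = True
    expires_at: Optional[float] = None  # implicit invalidation deadline


def apply(entry: CacheEntry, msg: Message, max_lag: int = 0) -> None:
    """Update a cached quasi-copy in response to a server message."""
    if isinstance(msg, NewValue):
        entry.value, entry.version, entry.valid = msg.value, msg.version, True
    elif isinstance(msg, Invalidate):
        entry.valid = False
    elif isinstance(msg, VersionNotice):
        # Stay valid only while the copy lags by at most max_lag versions.
        entry.valid = msg.version - entry.version <= max_lag


def usable(entry: CacheEntry, now: float) -> bool:
    """Implicit invalidation: the copy expires without any message arriving."""
    return entry.valid and (entry.expires_at is None or now < entry.expires_at)


entry = CacheEntry(value=10.0, version=1, expires_at=100.0)
```

The version-number message is the cheapest to send but pushes the coherency decision to the node, whereas shipping data costs more bandwidth and keeps the copy immediately usable.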

12
Implementation (cont.)
  • When to Propagate
  • Last minute: When condition violated
  • Immediately: When update occurs
  • Early
  • Delayed Update: Don't update central site until
    conditions satisfied
  • Collapsing conditions
  • Server checks conditions
  • How to check efficiently?
  • Load balancing
  • Can conditions be checked at nodes?
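
To make the difference between propagating immediately and at the last minute concrete, the sketch below counts how many messages the server sends for a stream of updates under an absolute-difference condition with bound `eps`. The function name, the policy strings, and the choice of condition are assumptions made for illustration; the early and delayed-update strategies are not modeled.

```python
from typing import Sequence


def propagation_count(updates: Sequence[float], eps: float, policy: str) -> int:
    """Count messages sent to one node for a series of master values.

    updates[0] is the value the node already holds.
    policy "immediately": send on every update.
    policy "last_minute": send only when |copy - master| < eps would be violated.
    """
    if policy not in ("immediately", "last_minute"):
        raise ValueError(f"unknown policy: {policy}")
    copy = updates[0]
    sent = 0
    for value in updates[1:]:
        violated = abs(copy - value) >= eps
        if policy == "immediately" or violated:
            copy = value  # the node now holds the freshly sent value
            sent += 1
    return sent


prices = [100, 102, 104, 107, 108]
immediate_msgs = propagation_count(prices, eps=5, policy="immediately")
last_minute_msgs = propagation_count(prices, eps=5, policy="last_minute")
```

In this run the immediate policy sends a message for each of the four updates, while the last-minute policy sends only one, at the cost of the server checking the condition on every update.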

13
Performance
  • Performance Model
  • Query processing time at node
  • Update time at node
  • Query processing time at central site
  • Update time at central site
  • Assumes central site 20 speed of node

14
Model Parameters

15
Results