Data Dissemination - PowerPoint PPT Presentation

Title:

Data Dissemination

Provided by: CIT788

Transcript and Presenter's Notes


1
Data Dissemination
  • Data Dissemination Problems
  • Basic Dissemination Methods: Push and Pull
  • Broadcast Disk (for read-only transactions)
  • Basic Schemes for Data Broadcast
  • The Hybrid (Push Pull) Approach
  • Temporal Consistency and Currency
  • Broadcasting Consistent Data
  • Multi-version Data Broadcast
  • Update First with Order (UFO)

2
Proactive Services
[Figure (from Dollimore): 1. User enters room wearing an active badge. 2. Infrared sensor detects the user's ID. 3. Display responds to the user: "Hello Roy".]
Firstly, when a new object enters a smart space,
the list of services provided in the space has
to be downloaded to the object. How should this
information be delivered to the new object?
Secondly, the object may generate requests to be
served by other objects within the space. How
should these requests be processed? Thirdly, the
object may have submitted a query or continuous
query (CQ) to monitor the status of the space.
How should the execution of the query/CQ be
supported?
3
Distributed Computing: Query Processing
Strategies
  • Query shipping
  • The server (the service provider) maintains the
    latest versions of data objects
  • Queries from clients are sent to a server for
    processing
  • Query results are returned to the clients
  • Data shipping
  • Clients send data requests to the server (the
    service provider)
  • The server returns the requested data objects to
    the clients
  • The queries are processed by the clients

4
Query Shipping Vs. Data Shipping
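[Figure: Query shipping. 1. The client sends requests over the uplink channel. 2. The server processes the query. 3. The server returns the results over the downlink channel.]
[Figure: Data shipping. 1. The client sends data requests over the uplink channel. 2. The server returns the data over the downlink channel. 3. The client processes the query.]
What are the tradeoffs? Transmission overhead,
processing cost, scalability and processing
delay. They depend on application and system
characteristics, e.g., data size, number of
queries, etc.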
5
Query Shipping Vs. Data Shipping
  • Which one is more suitable to pervasive
    computing?
  • A large number of moving objects submit
    (location-dependent) queries to access
    different types of real-time information
  • Monitoring of real-time sensor data using the
    in-network processing approach
  • Lack of powerful nodes at the device (network)
    level
  • Mobile networks
  • Low bandwidth
  • Asymmetric bandwidth: uplink bandwidth <<
    downlink bandwidth
  • Transaction types
  • Mostly read-only transactions (queries)
  • Sensor readings of environment
  • Continuous queries with a begin and end time for
    event detection (e.g., navigation, tracing)
  • Location-dependent queries: queries at different
    locations read different sets of data
    objects (spatial and temporal properties)

6
Data Dissemination Problems
  • Data shipping: how to provide the required data
    items to a large number of queries (from moving
    objects) for execution?
  • Note: since the queries are read-only, there is no
    need to update any data items at the server.
    Reading (detect events) -> responses

Broadcast data to clients through a mobile network
7
Performance Objectives in Data Dissemination
  • Workload. To minimize the total data transmission
    workload
  • Waiting delay. Some queries may have a deadline
    on their completion time. Meeting the deadline is
    important (to respond to critical events
    occurring in the system environment)
  • Tune-in time (conserve energy)
  • The clients may not know whether their required
    data will be shipped
  • A client may sleep and then wake up to tune in to
    the broadcast channel to get its required data
    item (avoiding continuous monitoring of the
    broadcast channel for data items)
  • Currency. Since most of the queries ask for
    real-time information about the environment (e.g.,
    temperature, traffic conditions, etc.), the data
    items provided to a query have to be the latest
    versions (not outdated)
  • Consistency. To ensure consistency (correctness)
    of the data items provided to a query (temporal
    consistency: two data items report the status
    of the environment at the same time point).
    Otherwise, incorrect results may be generated

8
Data Dissemination Methods: Pull vs. Push
  • Using data shipping for processing of mobile
    queries
  • Scalability (Not to process queries at the server
    and serve each data request one by one). The
    arrival rate of data requests could be very high
    due to large number of mobile clients
  • Suitable for monitoring and surveillance
    applications (continuous queries). Why?
  • How can data shipping be applied to in-network
    processing?
  • Pure Pull (on demand)
  • Clients explicitly (periodically for data
    monitoring) send data requests to a server
    through the uplink channel
  • The server returns the requested data items to
    the clients through the downlink channel
  • Design problem
  • What is the pulling period (based on the dynamic
    properties of the data)? Pulling period vs.
    transmission workload
  • Scalability problem (although the server does not
    need to process the queries, it needs to serve
    the data requests one by one)

9
Data Dissemination Method - Pulling
[Figure: Point-to-point communication. Clients 1 to n each send requests to the server and receive data in return. The server maintains a request queue and processes the data requests one by one.]
10
Data Dissemination Methods: Pull vs. Push
  • Pure Push (broadcast)
  • Data shipping with prediction
  • To predict what the data requirements of the
    queries are
  • Spatial and temporal properties (queries at
    similar time and from the clients at similar
    locations have similar data requirements)
  • The server defines a broadcast schedule, i.e.,
    based on the popularity of data items (identify
    the hot items using previous access statistics)
  • A broadcast schedule is a sequence of data items
    to be broadcast by the server
  • The server repeatedly broadcasts data according
    to a broadcast schedule to a client population
    without receiving any data requests
  • Clients monitor the broadcast channel and
    retrieve the data items they need as they
    appear in the broadcast channel
  • Applications of Push (data broadcast)
  • Listening to radio and watching TV
  • Information feeds such as stock quotes, sports
    tickers, electronic newsletters, traffic and
    weather information, cable TV

11
Data Dissemination Method - Pushing
[Figure: The broadcast server selects data for broadcast and sends it to all the clients (one-to-many communication). The clients monitor the broadcast channel for their needed data items.]
12
Comparison Pull Vs. Push (Pull)
  • Pull or Push? Which one is more suitable to
    pervasive computing applications? It depends on
    .
  • Pull requires a higher demand on uplink bandwidth
  • Sending pulling requests and returning data
  • Each data transmission can only serve one query
  • No waste in bandwidth although the bandwidth for
    serving each request is higher
  • All the transmitted data are needed by clients
  • In Pull, a query knows when its required data
    items will come (approximately, why?). The
    clients play an active role. They tell the server
    what they want
  • The workload at the server and the network
    depends on the arrival rate of requests and the
    number of clients. If there are many data
    requests, the waiting time will be very long.
    Will queries miss their completion deadlines?
  • Arrival rate A, service rate C
  • Utilization U = A/C
  • Queue length U/(1-U)
  • Assuming Poisson arrivals and exponential service
    times
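The queueing figures for a pull server can be checked numerically. The sketch below assumes the standard M/M/1 model (Poisson arrivals, exponential service, one server), with utilization taken as arrival rate divided by service rate. The function name `mm1_metrics` and the mean response time derived through Little's law are illustrative additions, not part of the slides.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state M/M/1 figures for a pull server (illustrative sketch).

    arrival_rate: data requests arriving per second (A in the slide)
    service_rate: data requests the server can serve per second (C in the slide)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    utilization = arrival_rate / service_rate
    if utilization >= 1.0:
        # The queue grows without bound: requests arrive faster than they are served.
        raise ValueError("unstable queue: arrival rate must be below service rate")
    mean_queue_length = utilization / (1.0 - utilization)
    # Little's law: mean time in system = mean number in system / arrival rate.
    mean_response_time = mean_queue_length / arrival_rate
    return {
        "utilization": utilization,
        "mean_queue_length": mean_queue_length,
        "mean_response_time": mean_response_time,
    }


if __name__ == "__main__":
    for a in (2.0, 5.0, 8.0, 9.5):
        print(a, mm1_metrics(a, 10.0))
```

As the arrival rate approaches the service rate, the queue length and response time grow sharply, which is the scalability problem of pure pull.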

13
Comparison Pull Vs. Push (Push)
  • Some data items are not wanted by any queries (a
    waste in bandwidth)
  • It is only a prediction
  • Push is suitable for disseminating data items to
    a large number of clients (more scalable) with
    similar data requirements (hot data items)
  • One data push could meet the data requirements of
    multiple queries (if the prediction is correct)
  • E.g., many clients may want to know the latest
    traffic conditions at the cross-harbor tunnel
  • E.g., TV broadcast vs. video on demand
  • In Push, the total broadcast workload is
    determined by the server (i.e., the pushing rate,
    the number of data items to be pushed each
    second)
  • The server may introduce a delay between two
    broadcast schedules to reduce the broadcast
    workload
  • Push is suitable for systems with a small database
    and small data items
  • Push is suitable for systems where the access
    probabilities of data items are uneven (hot data
    vs. cold data)

14
Design Problems in Using Push
  • How to define the broadcast schedule?
  • What is the length of a broadcast schedule?
    (Number of data items) (All items in the
    database?)
  • The access to required data items is sequential
    (a query consists of several read operations and
    the operations are processed one after another)
  • Clients need to listen continuously to the
    downlink channel
  • How to reduce the listening time to conserve
    energy? Doze and wake-up modes of operation

15
Broadcast Index
  • To reduce the monitoring (tune-in) time, an index
    is defined before each broadcast schedule starts
  • A broadcast schedule consists of two parts
  • A header index and a sequence of buckets of data
    items (one bucket one data item, assuming same
    size items)
  • The broadcast index indicates the broadcast
    schedule of the data items in a broadcast cycle
  • From the index, a client can calculate the
    broadcast time of its required data items
    (current time + position of data item in the
    broadcast schedule x the time to broadcast a data
    item)
  • Read the broadcast index, and then sleep until
    the required data item is about to be broadcast
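The wake-up calculation can be sketched as follows. It assumes equal-size items, zero-based positions counted from the first bucket after the index, and a per-item slot time of item size divided by broadcast bandwidth. The function and parameter names are hypothetical.

```python
def wake_up_time(index_end_time: float, position: int,
                 item_size_bits: float, bandwidth_bps: float) -> float:
    """Time at which a client should wake up to receive its data item.

    index_end_time: time at which the client finished reading the index
    position: zero-based position of the item in the schedule (assumption)
    item_size_bits / bandwidth_bps: time to broadcast one item
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    slot_time = item_size_bits / bandwidth_bps
    return index_end_time + position * slot_time


def sleep_duration(index_end_time: float, position: int,
                   item_size_bits: float, bandwidth_bps: float) -> float:
    """How long the client can doze between reading the index and tuning in again."""
    return wake_up_time(index_end_time, position,
                        item_size_bits, bandwidth_bps) - index_end_time


if __name__ == "__main__":
    # 1 Mbit items on a 2 Mbit/s channel: each slot lasts 0.5 s.
    print(wake_up_time(10.0, 3, 1_000_000, 2_000_000))
```

The client is awake only while reading the index and its own bucket, which is what reduces the tune-in time.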

16
Broadcast Index
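[Figure: A broadcast schedule starts with an index (i) followed by data buckets labelled 1 M to 5 M along a size axis. Each bucket takes size / broadcast bandwidth to transmit. The client tunes in to read the index, sleeps, and tunes in again for its data item.]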
17
Broadcast Schedules
  • A broadcast schedule is a sequence of data items
    (bucket)
  • When a broadcast schedule is finished, the next
    schedule will be defined and then be started
    immediately (or after a fixed delay)
  • The use of different methods to define the
    broadcast schedule affects the waiting time for
    data items
  • Two types of (read-only) queries
  • Each query consists of a set of read operations
  • Unordered: the operations can be executed in ANY
    order depending on the arrivals of their required
    data items
  • Ordered: the operations have a predefined
    execution sequence. E.g., query i consists of two
    operations, Read_i(x) and Read_i(y). It may be
    required that Read_i(x) is completed before
    Read_i(y) can be started
  • The response time of a query is the time interval
    from its generation time to the time when it
    receives all its required data items (ignoring
    the processing time of the last item)

18
Broadcast Schedules
  • The waiting time for a data item depends on
  • The length of a schedule
  • The position of the data item in a schedule
  • To minimize the mean waiting time of queries
  • Hot data items (popular data items) should be
    broadcast with a higher frequency

[Figure: Query x with read operations Read(i), Read(j) and Read(k), shown once as separate operations and once as the sequence Read(i) Read(j) Read(k).]
19
Broadcast Schedules
[Figure: The broadcast server sends an index followed by the broadcast schedule to clients 1 to n.]
20
Basic Schemes for Data Broadcast
  • Flat Disk, Skew Disk and Multi-Disk
  • Flat Disk (if it is difficult to identify the
    hot items)
  • A broadcast schedule consists of all the items in
    the database
  • In each broadcast cycle, all the data items in
    the database are broadcast one after another
    until the end of the database (cycle). Then the
    next cycle starts from the first item
  • The time to complete one broadcast cycle equals
    the time to broadcast all data items in the
    database
  • It is suitable for small databases, e.g.,
    broadcast of stock items (currently we have about
    1000 stock items)
  • Not scalable and not suitable for large database
    systems and multimedia broadcast
  • The waiting time of a query for its required data
    items depends on the size of the database and
    the sizes of the data items
  • Mean waiting time for a data item is half the
    cycle length
  • What will be the waiting time for multiple data
    items?

21
Flat Disk Schedule
  • All the data items in the database (A, B and C)
    are broadcast with the same frequency
  • Could it be?
  • Unordered: the operations can be performed in any
    order, e.g., calculation of a mean
  • Mean waiting time: T/2
  • T is the time to finish one broadcast cycle
  • How about queries with ordered operations?
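The effect of ordering on waiting time can be explored with a small slot-based simulation. This is only a sketch under stated assumptions: one item per slot, a query that starts listening at a slot boundary, and a wait that includes the slot in which the last item is received. The helper names `wait_slots` and `mean_wait` are hypothetical.

```python
def wait_slots(schedule, items, start, ordered):
    """Slots from `start` until all `items` are received on a cyclic schedule."""
    n = len(schedule)
    t = start
    if ordered:
        for item in items:
            # Skip slots until the next required item appears, then receive it.
            while schedule[t % n] != item:
                t += 1
            t += 1
        return t - start
    remaining = set(items)
    while remaining:
        # Unordered: take any required item as soon as it is broadcast.
        remaining.discard(schedule[t % n])
        t += 1
    return t - start


def mean_wait(schedule, items, ordered):
    """Average wait over every possible start slot of one broadcast cycle."""
    if not set(items) <= set(schedule):
        raise ValueError("every requested item must appear in the schedule")
    n = len(schedule)
    return sum(wait_slots(schedule, items, s, ordered) for s in range(n)) / n


if __name__ == "__main__":
    flat = ["A", "B", "C"]
    print("unordered {A, C}:", mean_wait(flat, ["A", "C"], ordered=False))
    print("ordered A then C:", mean_wait(flat, ["A", "C"], ordered=True))
    print("ordered C then A:", mean_wait(flat, ["C", "A"], ordered=True))
```

An ordered query can miss an item that passes before its predecessor has been read, so it may need more than one cycle, whereas an unordered query never waits longer than one full cycle.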

22
Skewed broadcast
  • Some data items are identified to be hot data
    items
  • Hot data items should be broadcast with a higher
    frequency since they are more likely to be awaited
    by queries
  • In skew broadcast, a broadcast schedule consists
    of a subset of hot data items in the database
  • How to define the length of a broadcast schedule
    and how to choose the data items to be included in
    a broadcast schedule?
  • Order the data items according to their access
    probabilities, which are calculated using previous
    access statistics reported by the clients
  • (Some) mobile clients may be requested to
    generate an access report periodically and send
    it to the broadcast server (similar to a market
    survey of a product)

23
Skewed broadcast
  • Design issue
  • Size of a broadcast schedule
  • Calculation of access probability for each item

[Figure: Data items ordered by increasing access probability. The items with the highest access probability, up to the broadcast schedule size, are selected for broadcast.]
24
Multi-Disk Schedule
  • Divide the data items in the database into
    several groups based on their hot/cold properties
    (access probability)
  • Each group forms a flat disk and the items in the
    same disk have the same broadcast frequency
  • Note that the sizes of the groups need not be
    the same
  • The broadcast of data items in the same disk is
    sequential, i.e., like a flat disk
  • Different disks have different broadcast
    frequencies
  • Multiple broadcast disks -> Multi-Disk
  • Changing the disk speeds changes their broadcast
    frequencies
  • How to define the broadcast frequencies and the
    schedule?
  • Using the average access probability of the group
    of data items
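One common way to interleave several disks is the chunk-based program generation known from the broadcast disks literature; the slides do not specify which method is intended, so the sketch below is an assumption. Each disk is split into chunks so that a disk with relative frequency f appears f times per major cycle. The function name and the integer relative frequencies are illustrative.

```python
from math import lcm  # multi-argument lcm requires Python 3.9+


def multi_disk_schedule(disks, rel_freqs):
    """Interleave disks into one broadcast cycle (illustrative sketch).

    disks: list of item lists, hottest disk first
    rel_freqs: positive integer relative broadcast frequency of each disk
    """
    if len(disks) != len(rel_freqs):
        raise ValueError("one relative frequency per disk is required")
    if any(f <= 0 for f in rel_freqs):
        raise ValueError("relative frequencies must be positive integers")
    max_chunks = lcm(*rel_freqs)
    chunked = []
    for items, freq in zip(disks, rel_freqs):
        num_chunks = max_chunks // freq
        size = -(-len(items) // num_chunks)  # ceiling division
        # Slices past the end of the disk are empty chunks (idle in this sketch).
        chunked.append([items[k * size:(k + 1) * size] for k in range(num_chunks)])
    schedule = []
    for minor_cycle in range(max_chunks):
        for chunks in chunked:
            schedule.extend(chunks[minor_cycle % len(chunks)])
    return schedule


if __name__ == "__main__":
    # A is hot (broadcast twice per cycle), B and C are cold.
    print(multi_disk_schedule([["A"], ["B", "C"]], [2, 1]))
```

With this construction the inter-arrival time of each item is fixed, and the cycle length is the sum over disks of disk size times relative frequency when the chunks divide evenly.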

25
Multi-Disk Schedule
  • Design issue
  • Calculation of access probability for each item
  • Grouping of data items
  • Assigning broadcast frequency

[Figure: Data items ordered by increasing access probability and divided into groups G1 to G5.]
26
Multiple Disk Schedule
  • Multiple disks of different sizes and speeds are
    superimposed on the broadcast channel
  • Data item A is a hot data item. Its broadcast
    frequency is higher than that of B and C
  • Could it be?
  • What is the difference?
  • How to interleave the broadcast of cold/hot data
    items so that the inter-arrival time between two
    different instances of the same data item matches
    the clients' needs?
  • What is the length of a broadcast schedule in
    multi-disk?

27
Data Dissemination Methods
  • On-demand (Pull) broadcast
  • Clients send data requests to the server using
    the uplink channel (if uplink bandwidth is
    available)
  • Server defines the broadcast schedule based on
    the received client requests and the access
    probability of the data items
  • Hybrid: using both Push and Pull
  • The down-link channel is divided into two parts
  • Some of the bandwidth is reserved for sending
    data items to clients on demand
  • Some of the bandwidth is for data broadcast
    following the broadcast schedule
  • How much bandwidth should be reserved for
    pulling?
  • How to interleave the service to push and pull?
  • Suitable for queries which need to access
    multiple data items
  • Data requests are only sent after waiting for a
    long time
  • Using on-demand for cold items (data items in
    slow disks)

28
Push and Pull Broadcast Schedules
  • Pre-defined broadcast frequency for each group of
    data items according to applications and access
    statistics
  • How to divide the bandwidth between broadcast
    schedule and on demand schedule?
  • Access statistics
  • Periodic collection of access statistics from
    mobile clients
  • Scheduling of on-demand requests
  • FCFS
  • Earliest deadline first (each query is assigned a
    deadline for completion)
  • Longest waiting time first (the deadline
    intervals of the queries are different)
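The three on-demand policies can be compared with a short sketch. It assumes each pending request records its item, arrival time and deadline, and that one broadcast of an item serves every pending request for it. Longest waiting time first is interpreted here as picking the item whose pending requests have the largest total waiting time, which is one reading from the on-demand broadcast literature and not necessarily the one intended in the slides. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    item: str        # requested data item
    arrival: float   # time the request reached the server
    deadline: float  # completion deadline of the issuing query


def pick_next(queue, now, policy):
    """Return the data item to send next under the given policy."""
    if not queue:
        raise ValueError("no pending requests")
    if policy == "FCFS":
        return min(queue, key=lambda r: r.arrival).item
    if policy == "EDF":
        return min(queue, key=lambda r: r.deadline).item
    if policy == "LWF":
        # Assumption: sum the waiting time of all pending requests per item.
        total_wait = {}
        for r in queue:
            total_wait[r.item] = total_wait.get(r.item, 0.0) + (now - r.arrival)
        return max(total_wait, key=total_wait.get)
    raise ValueError(f"unknown policy: {policy}")


def serve(queue, item):
    """One transmission of `item` satisfies every pending request for it."""
    return [r for r in queue if r.item != item]


if __name__ == "__main__":
    pending = [Request("x", 0, 20), Request("y", 1, 5),
               Request("z", 2, 30), Request("z", 3, 30), Request("z", 4, 30)]
    for policy in ("FCFS", "EDF", "LWF"):
        print(policy, pick_next(pending, now=10, policy=policy))
```

The three policies can pick different items from the same queue, which is why the choice matters when queries have different deadline intervals.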

29
Broadcast Schedules
[Figure: Clients 1 to n send on-demand data requests to the server, which prioritizes them. The broadcast schedule interleaves push items (broadcasting the skew disk) with pull items (on-demand data).]
30
Currency and Consistency in Data Broadcast
  • A query may need to read a set of data items
    in a predefined sequence
  • The definition of a transaction
  • Consists of a sequence of primitive operations
    enclosed between begin and end markers
  • The operations may be ordered or unordered
    (precedence constraints)

[Figure: A transaction with operations R(x), R(y), R(z) and commit C. Partial order: R(x) and R(y) may execute concurrently or in any order.]
31
Execution Order and Data Broadcast
  • The constraints in execution of the operations in
    a query can greatly increase the waiting time for
    data items. Why?
  • The waiting time for completing a query depends
    on both the broadcast schedule and the execution
    orders of the operations in a query
  • Since the operation Read(z) cannot be performed
    before Read(x) and Read(y), the query cannot read z
    from the broadcast channel (it does not know to do
    so) if it has not yet obtained data item x
  • In the worst case, the waiting time is nC (C is the
    time to complete one broadcast cycle and n is the
    number of items)
  • The problem becomes more serious when we consider
    two additional issues in data dissemination:
    currency and consistency

32
Meeting Currency Requirement
  • Update transactions are performed at the database
    server to maintain the freshness of the data
    items in the database (update streams)
  • Sensors: periodic generation
  • Location updates: based on the adopted update
    generation method, e.g., speed dead reckoning
  • Data conflicts may occur between update
    transactions and mobile queries
  • Update transactions are performed at the database
    server to maintain the freshness of data objects
    in the database
  • Reads of data objects (by queries) occur
    concurrently

33
Meeting Consistency Requirement
  • Definition (data conflict): two transactions have
    a data conflict if the first one reads a data
    object and the second one updates the same object
    before the commit (completion) of the first one
  • How to resolve data conflicts in a database
    system?
  • The conflict cannot be detected by locking or
    using the conventional concurrency control
    methods
  • Distributed concurrency control problem
  • But the overhead of locking in a wireless
    network is too heavy
  • How to resolve the disconnection problem after
    granting a lock to a client program?
  • Data conflicts in transaction execution may
    result in inconsistent data accesses
  • Generate incorrect results from the transactions

34
Broadcast Schedules
[Figure: The server applies updates while the broadcast server sends an index followed by the broadcast schedule to clients 1 to n.]
35
Concurrent Execution: Inconsistent Retrieval
Problem
Transaction T: BankWithdraw(A, 100); BankDeposit(B, 100)
Transaction U: BankBranchTotal()
Interleaved schedule (value shown after each step):
T: balance = A.Read()              200
T: A.Write(balance - 100)          100
U: balance = A.Read()              100
U: balance = balance + B.Read()    300
T: balance = B.Read()              200
T: B.Write(balance + 100)          300
36
Correct Execution of Transactions
  • A schedule shows the execution order of the
    operations of a set of transactions (update and
    mobile transactions)
  • Serial execution (schedule)
  • Execute transactions one after another
  • The next transaction starts only after the
    previous one has been committed or aborted
  • If we have two transactions, we may have two
    different serial schedules, i.e., T1 then T2, and
    T2 then T1
  • Always maintains database consistency since all
    transactions start from a consistent database
    state
  • Serial equivalence (serializable)
  • Transactions are executed concurrently but the
    result is equivalent to that of a serial schedule
    of the same set of transactions (which serial
    schedule? Any one)

37
Serial Execution
Transaction T: BankWithdraw(A, 100); BankDeposit(B, 100)
Transaction U: BankBranchTotal()
Serial schedule, T then U (value shown after each step):
T: balance = A.Read()              200
T: A.Write(balance - 100)          100
T: balance = B.Read()              200
T: B.Write(balance + 100)          300
U: balance = A.Read()              100
U: balance = balance + B.Read()    300
U: balance = balance + C.Read()    400 .
38
Serial Equivalence
Transaction T: BankWithdraw(A, 4); BankDeposit(B, 4)
Transaction U: BankWithdraw(C, 3); BankDeposit(B, 3)
Interleaved schedule (value shown after each step):
T: balance = A.Read()              100
T: A.Write(balance - 4)            96
U: balance = C.Read()              300
U: C.Write(balance - 3)            297
T: balance = B.Read()              200
T: B.Write(balance + 4)            204
U: balance = B.Read()              204
U: B.Write(balance + 3)            207
39
Consistency in Data Broadcast
  • How to determine the correctness of transaction
    execution? I.e., under which situations is a
    conflict harmful?
  • Look at the execution order of the conflicting
    operations in a schedule
  • Serialization graph (SG): each edge Ti -> Tj in an
    SG means that at least one of Ti's operations
    precedes and conflicts with one of Tj's operations
  • At the client, a query consists of a read operation
    to read a data item x
  • At the server, an update transaction wants to
    update x
  • Serializability theorem
  • A schedule H is serializable iff SG(H) is acyclic

40
Consistency in Data Broadcast
  • Example 1: data conflict between a mobile
    transaction (MT) and an update transaction
  • Suppose an update transaction, U, updates data item
    d5 and then data item d2, and an MT wants to read
    d2 and d5. Remember that the update is performed at
    the server and the MT is executed at a mobile
    client. If the schedule is
  • Server broadcasts d2
  • MT reads d2
  • U updates d5 and d2
  • Server broadcasts d5
  • MT reads d5
  • The MT may observe inconsistent data values. The
    serialization graph is cyclic, MT -> U -> MT,
    and the schedule is non-serializable
  • The reason is that the MT reads a data item, d2,
    which is in conflict with U, before the update
    from U, and it reads another conflicting data
    item, d5, after the update from U

41
Consistency in Data Broadcast
  • Example 2: an MT conflicts with two (or more)
    update transactions
  • Even though the serialization order between an
    update transaction and a mobile transaction is
    acyclic, the final serialization graph can still
    be cyclic due to transitive dependencies.
  • Suppose there are two updates U1 and U2 such that
    U1 updates d2 and then d1, and U2 updates d1 and
    then d5. If the schedule is
  • Broadcast transaction (BT) broadcasts d2
  • MT reads d2
  • U1 updates d2 and d1
  • U2 updates d1 and d5
  • Broadcast transaction (BT) broadcasts d5
  • MT reads d5
  • The serialization graph is cyclic: U2 -> MT
    -> U1 -> U2
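The serializability test in the two examples amounts to cycle detection on the serialization graph. The sketch below assumes the graph is given as a list of (Ti, Tj) edges, each meaning that an operation of Ti precedes and conflicts with an operation of Tj. The edge lists are written out by hand from the two schedules, and the function name is hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed serialization graph contains a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True  # back edge: the current path loops onto itself
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


if __name__ == "__main__":
    # Example 1: MT reads d2 before U writes it, and reads d5 after U writes it.
    example1 = [("MT", "U"), ("U", "MT")]
    # Example 2: MT -> U1 (d2), U1 -> U2 (d1), U2 -> MT (d5).
    example2 = [("MT", "U1"), ("U1", "U2"), ("U2", "MT")]
    print(has_cycle(example1), has_cycle(example2))
```

In Example 2 no pair of transactions forms a cycle on its own; the cycle only appears once the transitive dependency through U1 and U2 is included.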

42
How to resolve this problem?
  • The conventional methods for concurrency control
    are not suitable
  • Multi-version Data Broadcast
  • For flat disk only
  • Update First with Order (UFO)
  • For flat disk, skew disk and multi-disk

43
Multi-Version Data Broadcast
  • Multi-version data broadcast
  • Broadcast multiple versions of a data item
    (current version + previous versions). How many
    versions?
  • A Push-based method
  • No uplink data requests
  • No need to set any lock or to inform the
    database server before accessing any data items
  • Maintains multiple versions of each data item
  • Each new update creates a new version and the old
    versions are still maintained in the system

44
Multi-Version Data Broadcast
  • Providing a consistent view to queries by batch
    updates
  • The updates on data items are batched until the
    end of a broadcast cycle even if they arrive in
    the middle of a broadcast cycle
  • During updates, the broadcast of data items is
    suspended
  • The version number indicates at which cycle-end
    the version is created
  • Even with no update, a new version is created
    using the old version at the end of each
    broadcast cycle
  • After the completion of the batch of updates, the
    database is consistent and each newly created
    data version is assigned a cycle number as its
    version number
  • Accessing data versions in MV
  • If a query wants to access a data object, it
    will get the latest version of the data object
    for its first read operation from the broadcast
    cycle
  • The subsequent read operations of the query will
    read the data objects with the same version
    number as the first operation

45
Multi-Version Data Broadcast
  • How many versions to be broadcast?
  • In MV, it is assumed that each query has a
    maximum life-span and no query exists in the
    system longer than the life-span (L)
  • The life-span can be considered as the deadline
    interval of a query. Start time + deadline
    interval = deadline
  • After the deadline, the query will be aborted.
    Why?
  • The maximum life-span of the queries together
    with the time required for completing a broadcast
    cycle (BC) is used to calculate the number of
    versions and the versions to be broadcast in a
    cycle for a data item
  • L/BC
  • Assuming the use of flat disk
  • Why? What will be the problem if a skew disk is
    used?
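The version rule for MV can be sketched in a few lines. The slides give the version count only as L/BC; rounding it up to an integer is an assumption made here. The broadcast is modeled as a hypothetical mapping from item to a dictionary of version number to value, and every item is assumed to carry every version number, since a new version is created at each cycle end even without an update.

```python
from math import ceil


def versions_per_item(max_life_span: float, cycle_time: float) -> int:
    """Number of versions to broadcast per item (L/BC, rounded up by assumption)."""
    if max_life_span <= 0 or cycle_time <= 0:
        raise ValueError("life-span and cycle time must be positive")
    return ceil(max_life_span / cycle_time)


class MVQuery:
    """Read-only query that pins the version number seen by its first read."""

    def __init__(self):
        self.version = None

    def read(self, item, broadcast):
        versions = broadcast[item]  # {version number: value} for this item
        if self.version is None:
            # First read: take the latest version in the current cycle.
            self.version = max(versions)
        if self.version not in versions:
            # The pinned version is no longer broadcast: the query outlived L.
            raise LookupError("pinned version is no longer broadcast")
        return versions[self.version]


if __name__ == "__main__":
    cycle7 = {"x": {6: 10, 7: 11}, "y": {6: 20, 7: 21}}
    cycle8 = {"x": {7: 11, 8: 12}, "y": {7: 21, 8: 25}}
    q = MVQuery()
    print(q.read("x", cycle7), q.read("y", cycle8))
```

The second read returns the version-7 value of y although version 8 is available, which keeps the query's view consistent at the cost of currency.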

46
Multi-Version Data Broadcast
  • Why is data consistency guaranteed in MV?
  • The update and broadcast of data objects are NOT
    interleaved
  • The view provided in each broadcast cycle is a
    CONSISTENT view at the start time of the
    broadcast cycle. What is the definition of a
    transaction?
  • It is a consistent view since there are no
    incomplete (partially completed) transactions in
    the system at that time point
  • Remember: if a transaction starts from a
    consistent view, the database is consistent when
    it is completed (assuming a concurrency control
    method, e.g., 2PL, resolves the data conflict
    problem among the conflicting transactions)

47
Multi-Version Data Broadcast
  • MV data broadcast

48
Multi-Version Data Broadcast
  • MV can be applied for accessing cached data
    objects
  • The clients may maintain the previous versions of
    data items at their caches and the same rule for
    accessing broadcast data is used for accessing
    cached items
  • The multi-version method is very useful for
    systems where the mobile clients are frequently
    disconnected from the mobile network

49
Multi-Version Data Broadcast
  • Consistency Vs. Currency
  • Although MV broadcast can ensure consistency of
    data objects provided to a mobile query, the
    currency of data objects is sacrificed
  • Why? Delays (and even skipping) in processing
    updates (batch updates)
  • The latest version of a data object to be
    broadcast in a cycle is the last version before
    the start of the broadcast cycle (how about the
    others)
  • Each data object has to be broadcast at least
    once in each cycle (flat disk). What will happen
    if not?
  • Multiple version broadcast overhead
  • Point consistency Vs. interval consistency
  • MV provides a consistent view of the database
    between the start time and end time of a query
  • How about continuous queries, which want to
    generate results continuously over an
    interval? Skipped updates mean that some
    events are ignored

50
Update-first with Order (UFO)
  • UFO is another algorithm to ensure data
    consistency for mobile queries
  • In UFO, instead of detecting data conflicts
    between mobile queries and update transactions,
    it checks data conflicts between a broadcast
    transaction and an update transaction
  • The broadcast schedule is modeled as a
    transaction (BT)
  • The length of a BT is defined as the max life
    time of a mobile query
  • The basic principle of the UFO algorithm is to
    ensure that if data conflicts occur between a BT
    and an update transaction, the serialization
    order between them will always be U -> BT
  • Since mobile queries (MT) read data items from
    broadcast transactions, their serialization
    orders are always BT -> MT
  • The serialization order between the update
    transactions and the mobile queries will always
    be U -> MT and serializable

51
Update-first with Order (UFO)
  • The execution of an update transaction (at the
    server) is divided into two phases: the execution
    phase and the update phase
  • During the execution phase, an update transaction
    is executed and data conflicts with other update
    transactions are resolved according to the
    adopted concurrency control protocol
  • The updates of data items are written in a
    private workspace of the transaction during the
    execution phase
  • When all operations of an update transaction have
    been executed, it enters the update phase
  • Permanent updates to the database are performed by
    copying the new values from the private workspace
    into the database
  • During the update phase, the broadcast of data
    items is stopped (BT always observes a consistent
    database)

52
Update-first with Order (UFO)
  • Before an update transaction starts its update
    phase, the system detects data conflicts between
    the update transaction and the broadcast
    transactions in the current and previous
    broadcast cycles
  • At the start time of the update phase, the set of
    data items to be updated by the update
    transaction is known, as all its operations
    have been completed
  • At the same time, the set of data items to be
    read (broadcast) by a broadcast transaction is
    also known, as it results from the broadcast
    algorithm
  • The two sets of data items are compared. If
    they overlap, there is a data conflict
  • The conflicting items will be rebroadcast
  • The overhead (re-broadcast) depends on the
    conflict probability

53
Update-first with Order (UFO)
  • BT: the broadcast transaction for any current
    broadcast cycle i
  • $O_{BT}$: the set of data items of broadcast
    transaction BT
  • $O_U$: the set of data items of update
    transaction U
  • $BA = \{\, x \mid x \in O_{BT} \cap O_U,\ x$ is already
    broadcast when U arrives $\}$
  • Before the permanent update starts, the following
    algorithm is performed
  • If $O_{BT} \cap O_U = \emptyset$
  • Then BT and U have no dependency
  • Else
  • If $BA = \emptyset$
  • Then the serialization order is U -> BT
  • Else
  • For each data item $i \in BA$
  • re-broadcast data item i
  • Next
  • the serialization order is U -> BT
  • End If
  • End If
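The conflict check performed before the update phase can be sketched with plain set operations. The function name, the return shape and the use of `None` for "no dependency" are illustrative choices, not part of the slides. The inputs are the items of the broadcast transaction, the items written by the update transaction, and the items already broadcast in the current cycle when the update arrives.

```python
def ufo_check(broadcast_items, update_items, already_broadcast):
    """Decide the serialization order and which items must be re-broadcast.

    Returns (order, rebroadcast): order is None when BT and U do not
    conflict, otherwise "U -> BT"; rebroadcast lists conflicting items that
    were already sent and must be sent again after the update.
    """
    overlap = set(broadcast_items) & set(update_items)
    if not overlap:
        return None, []  # no shared items: BT and U have no dependency
    # BA: conflicting items that clients may already have read in this cycle.
    ba = sorted(overlap & set(already_broadcast))
    # With BA empty, the update simply precedes the rest of the broadcast.
    return "U -> BT", ba


if __name__ == "__main__":
    # BT has sent d2 and will send d5; U updates d5 and d2.
    print(ufo_check(["d2", "d5"], ["d5", "d2"], ["d2"]))
```

Re-broadcasting the items in BA lets a query that has already read the old value pick up the new one, so the order stays U -> BT -> MT.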

54
Update-first with Order (UFO)
55
UFO Example
  • The broadcast transaction (BT) broadcasts d2
  • MT reads d2
  • Compare the data sets of U and BT
  • U updates d5
  • U updates d2
  • BT re-broadcasts d2
  • MT reads the most up-to-date value of d2
  • BT continues its process and broadcasts d5
  • MT reads d5
  • The serialization graph is acyclic: U -> MT

56
MV and UFO
  • Relative consistency problem
  • MV: accessing data with the same version
  • UFO: assigning time-stamps to data versions to
    indicate their validity. The checking then
    follows the requirement of relative consistency
  • Ordered transaction problem
  • MV: more versions need to be included since
    the query life-span is longer
  • UFO: need to restart a query if the arrival order
    is different from the access order of the data
    items
  • The restart cost can be minimized by caching
    previously accessed data items

57
References
  • Schiller Mobile Communications, Ch 6.1 and 6.2
  • Fundamentals of Mobile and Pervasive Computing,
    Chapter 3