Title: Real-Time Databases
1Real-Time Databases
- Krithi Ramamritham, "Real-Time Databases," International Journal of Distributed and Parallel Databases, 1(2), pp. 199-226, 1993.
- J. Stankovic, S. H. Son, and J. Hansson, "Misconceptions About Real-Time Databases," IEEE Computer, vol. 32, no. 6, pp. 29-36, June 1999.
2Outline
- Motivation
- Characteristics of data in RTDB
- Characteristics of transactions in RTDB
- Relations between active DB and RTDB
- Transaction processing in RTDB
- Research issues
3Motivation
- Many applications involve
- time-constrained access to data
- data temporal validity
- examples: agile manufacturing, stock trading, e-commerce, command and control, network management, target tracking, ...
- Requirements
- Timely transaction/query processing
- Use fresh, i.e., temporally consistent, data
4Traditional Databases
- (Traditional) DBs
- Deal with persistent data
- Transactions access (persistent) data while maintaining consistency
- Serializability is the correctness criterion
- Support a good throughput and response time
5Background: Serializability
- Correctness criterion for concurrent transaction executions
- Why concurrent transactions?
- Better performance than serial executions
- Definition
- A concurrent execution of transactions is serializable if it is equivalent to a serial execution of the same transactions
- A correct concurrent execution of the transactions produces the same result as if they were executed one at a time
6Background: Conflict Serializability
- Two operations conflict if
- they are issued by two different transactions,
- they access the same data, and
- at least one of them is a write
- Two transaction schedules are conflict-equivalent if all conflicting operations are in the same order in the two schedules
- A concurrent schedule is conflict-serializable if it is conflict-equivalent to a serial schedule
7Background: Conflict Graph
- Conflict graph
- Nodes: transactions
- Directed edges: conflicts
- Example schedule S: w1(x) r2(x) r3(y) w2(z) r3(z) w3(y) r1(y)
- A schedule is conflict serializable if there's no cycle in the conflict graph
[Figure: conflict graph for S with nodes T1, T2, T3]
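The conflict-graph test can be tried on the example schedule S with a short Python sketch. The helper names (`parse`, `conflict_edges`, `has_cycle`) and the string encoding of the schedule are assumptions made for illustration; they do not come from the slides.

```python
from itertools import combinations

def parse(schedule):
    """Turn 'w1(x) r2(x)' into [('w', 1, 'x'), ('r', 2, 'x')]."""
    ops = []
    for op in schedule.split():
        paren = op.index("(")
        ops.append((op[0], int(op[1:paren]), op[paren + 1:-1]))
    return ops

def conflict_edges(ops):
    """Edge (Ti, Tj) when an operation of Ti precedes a conflicting one of Tj."""
    edges = set()
    # combinations() keeps schedule order: the first element always comes earlier.
    for (k1, t1, d1), (k2, t2, d2) in combinations(ops, 2):
        if t1 != t2 and d1 == d2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed conflict graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {n: 0 for n in graph}  # 0 = unvisited, 1 = on DFS stack, 2 = done

    def visit(n):
        state[n] = 1
        for m in graph[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in graph)

S = "w1(x) r2(x) r3(y) w2(z) r3(z) w3(y) r1(y)"
edges = conflict_edges(parse(S))
print(sorted(edges))     # [(1, 2), (2, 3), (3, 1)]
print(has_cycle(edges))  # True
```

The three edges form the cycle T1 -> T2 -> T3 -> T1, so S is not conflict serializable.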
8Background: Concurrency Control - Locking
- A transaction should get a lock on a data item before accessing it
- Shared lock: more than one transaction can hold a shared lock on a data item at the same time
- Exclusive lock: only one transaction can hold an exclusive lock on a data item at a time
- If a data item has a shared lock, other transactions can get a shared lock to read it
- If a data item is already locked through either a shared or an exclusive lock, another transaction cannot get an exclusive lock on it → it has to block
- This simple mechanism doesn't necessarily guarantee conflict-serializability
9Background: 2PL (Two Phase Locking) for Conflict Serializability
- A transaction execution can be divided into two phases
- Growing phase: the transaction can only acquire locks
- Shrinking phase: it can only release locks
- Strict 2PL: hold exclusive locks until the transaction commits
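A minimal Python sketch of shared/exclusive lock compatibility and the two-phase rule follows. `LockTable` and `TwoPhaseTxn` are hypothetical names, and the sketch returns `False` where a real lock manager would block the requester; it illustrates the rules only and is not a complete lock manager.

```python
class LockTable:
    """Tracks, per data item, the lock mode and the set of holders."""

    def __init__(self):
        self.locks = {}  # item -> [mode, set of transaction ids]

    def try_lock(self, txn, item, mode):
        """Grant the lock if compatible; False means the caller would block."""
        entry = self.locks.get(item)
        if entry is None:
            self.locks[item] = [mode, {txn}]
            return True
        held_mode, holders = entry
        if holders == {txn}:          # sole holder may re-request or upgrade
            if mode == "X":
                entry[0] = "X"
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)          # shared locks are compatible
            return True
        return False                  # any other combination conflicts

    def unlock(self, txn, item):
        entry = self.locks.get(item)
        if entry and txn in entry[1]:
            entry[1].discard(txn)
            if not entry[1]:
                del self.locks[item]


class TwoPhaseTxn:
    """Enforces the 2PL rule: no lock may be acquired after the first release."""

    def __init__(self, txn_id, table):
        self.txn_id = txn_id
        self.table = table
        self.shrinking = False

    def lock(self, item, mode):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested after a release")
        return self.table.try_lock(self.txn_id, item, mode)

    def unlock(self, item):
        self.shrinking = True         # the growing phase is over
        self.table.unlock(self.txn_id, item)


table = LockTable()
t1, t2 = TwoPhaseTxn(1, table), TwoPhaseTxn(2, table)
print(t1.lock("x", "S"), t2.lock("x", "S"))  # True True
print(t2.lock("x", "X"))                     # False
t1.unlock("x")
print(t2.lock("x", "X"))                     # True
```

After `t1.unlock("x")`, any further `t1.lock(...)` call raises, which is exactly the shrinking-phase restriction.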
10RT systems
- Meet timing constraints
- Deal with temporal data that become outdated after a certain time
- Recall: real-time ≠ fast (see the next slide)
11Real-time ≠ Fast
Time-cognizant transaction scheduling & concurrency control required!
12Why RTDB?
- RT applications may deal with large amounts of data, e.g., for target tracking, agile manufacturing, stock trading, ...
- DB can facilitate
- description of data: schemas help avoid redundancy of data
- maintenance of correctness & integrity of data
- efficient access to data: indexing
- correct execution of transactions in spite of concurrency and failures: ACID properties (Atomicity, Consistency, Isolation, Durability)
13RTDB Features
- Not all data are permanent; some are temporal, e.g., sensor data or stock prices
- Temporally-correct serializable schedules are a subset of serializable schedules
- Timeliness is more important than correctness
- Tradeoff between timeliness & serializability
- Tradeoff between timeliness & atomicity
- Monotonic queries and transactions supported by the milestone approach
- Tradeoff between timeliness & data temporal consistency
- Data similarity concept
- Adaptive update policy
- Both real-time scheduling & database technologies can be applied to real-time data management
14Data Characteristics in RTDB
- Temporal data consistency: keep track of the real world status
- Absolute consistency between the state of the environment, e.g., manufacturing or market status, and its reflection in databases
- Relative consistency among the temporal data used to derive other data
- e.g., relative consistency of the stock price data used to derive the S&P 500 index
15Absolute consistency
- Denote a temporal data item in RTDB by d: (value, avi, timestamp)
- d_value denotes the current value of d
- d_timestamp denotes the time when d was updated
- d_avi denotes d's absolute validity interval, i.e., the length of the time interval following d_timestamp during which d is considered to have absolute validity
- d is absolutely consistent if (current time - d_timestamp) ≤ d_avi
16Relative Consistency
- Relative consistency set R: a set of data used to derive a new data item
- Each set R is associated with a relative validity interval (rvi)
- Example
- S&P 500 index is an average of 500 stock prices
- Target position can be computed using, e.g., aircraft heading, air speed, wind speed & direction, barometric pressure, ...
17Relative Consistency
- Assume a data item d in R (relative consistency set)
- d has a correct state if
- d_value is logically consistent: it satisfies all integrity constraints
- d is temporally consistent
- absolute consistency: (current time - d_timestamp) ≤ d_avi
- relative consistency: for every d' in R, |d_timestamp - d'_timestamp| ≤ R_rvi
18Relative Consistency
- Examples
- temperature_avi = 5, pressure_avi = 10, R = {temperature, pressure}, R_rvi = 2
- If current time = 100,
- temperature = (347, 5, 95) and pressure = (50, 10, 97), written as (value, avi, timestamp), are temporally consistent
- temperature = (347, 5, 95) and pressure = (50, 10, 92) are not, because (95 - 92) > R_rvi = 2, although temperature and pressure both meet the absolute consistency requirements
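The two checks can be reproduced on the temperature/pressure example with a small Python sketch. The `Item` class and the function names are assumptions introduced for illustration; the rules themselves are the absolute and relative consistency conditions stated on the earlier slides.

```python
from dataclasses import dataclass

@dataclass
class Item:
    value: float
    avi: float        # absolute validity interval
    timestamp: float  # time of the last update

def absolutely_consistent(d, now):
    """(current time - timestamp) <= avi"""
    return (now - d.timestamp) <= d.avi

def relatively_consistent(items, rvi):
    """Every pair of timestamps in the set differs by at most rvi."""
    stamps = [d.timestamp for d in items]
    return max(stamps) - min(stamps) <= rvi

def temporally_consistent(items, rvi, now):
    return (all(absolutely_consistent(d, now) for d in items)
            and relatively_consistent(items, rvi))

now, rvi = 100, 2
temperature = Item(347, 5, 95)

fresh_pressure = Item(50, 10, 97)
print(temporally_consistent([temperature, fresh_pressure], rvi, now))  # True

old_pressure = Item(50, 10, 92)
print(absolutely_consistent(old_pressure, now))                        # True
print(temporally_consistent([temperature, old_pressure], rvi, now))    # False
```

Comparing the largest and smallest timestamps is equivalent to checking every pair, which keeps the relative check linear in the size of R.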
19Relative consistency
- At time 100, temperature = (347, 5, 95) and pressure = (50, 10, 92) are not temporally consistent because (95 - 92) > R_rvi = 2, although temperature and pressure meet the absolute consistency requirements
- Is this good?
- Users may expect relative consistency to be satisfied whenever the absolute consistency of all the data in R is met!
- avi of pressure should be reduced to 5 to meet the required rvi of 2, and the updates of pressure and temperature should always be done within 2 time units
- A better metric is required! But not much work has been done to address this issue!
20Transaction characteristics in RTDB
- Transaction types
- Write-only transactions obtain the real-world status and write it into RTDB (also called sensor transactions)
- Update transactions derive and store new data in RTDB (also called derived data recomputations)
- Read-only transactions, i.e., queries
- Read sensor data and compute actuation signals
- User transactions that read temporal data and read/write non-temporal data
21Transaction characteristics in RTDB
- Example transactions
- Sample wind velocity every 10s
- Update robot positions every 20s
- If temperature > 100, add coolant to reactor within 10s
- If the average stock price of a user portfolio changes by more than 10, sell the stocks within 5s
22Transaction characteristics in RTDB
- Deadlines
- Hard: negative infinite value upon a deadline miss
- Soft: value decreases as time goes on after the deadline
- Firm: no value after the deadline miss
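The three deadline types can be written as value functions of the completion time. The sketch below is illustrative only: the function name `transaction_value`, the linear decay for soft deadlines, and the `decay_rate` parameter are assumptions, since the slide only says that the value of a soft-deadline transaction decreases after the deadline.

```python
import math

def transaction_value(kind, full_value, deadline, finish, decay_rate=1.0):
    """Value obtained by a transaction that completes at time `finish`."""
    if finish <= deadline:
        return full_value                 # met the deadline: full value
    if kind == "hard":
        return -math.inf                  # a miss is catastrophic
    if kind == "firm":
        return 0.0                        # a late result is worthless
    if kind == "soft":
        # Assumed linear decay, clamped at zero.
        return max(0.0, full_value - decay_rate * (finish - deadline))
    raise ValueError(f"unknown deadline type: {kind}")

print(transaction_value("soft", 10, deadline=5, finish=7, decay_rate=2))  # 6.0
print(transaction_value("firm", 10, deadline=5, finish=7))                # 0.0
print(transaction_value("hard", 10, deadline=5, finish=7))                # -inf
```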
23Transaction characteristics in RTDB
- How often do we need to execute a sensor transaction to update data x?
- Period = 0.5 × avi(x): half-half principle
- If period = avi: x can be stale
- If period = 0.5 × avi: x is fresh as long as the sensor transaction finishes within the period
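The reasoning behind the half-half principle can be checked numerically. One update may complete right at the start of its period and the next one right at the end of the following period, so two consecutive completions can be almost two periods apart. The Python sketch below assumes the timestamp of x is the completion time of the sensor transaction and that each instance finishes somewhere inside its own period; the function names are hypothetical.

```python
import random

def worst_case_age(period):
    """Upper bound on the gap between two consecutive update completions."""
    return 2 * period

def stays_fresh(period, avi):
    """x never outlives its avi if the worst-case gap fits inside avi."""
    return worst_case_age(period) <= avi

def max_observed_age(period, instances=1000, seed=0):
    """Simulate completions at random offsets inside each period."""
    rng = random.Random(seed)
    completions = [k * period + rng.uniform(0, period) for k in range(instances)]
    return max(b - a for a, b in zip(completions, completions[1:]))

avi = 10
print(stays_fresh(period=avi, avi=avi))        # False
print(stays_fresh(period=0.5 * avi, avi=avi))  # True
print(max_observed_age(0.5 * avi) <= avi)      # True
```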
24Transaction characteristics in RTDB
- How often do we need to recompute a derived data item?
- More complex
- Ideally, a derived data item should be fresh if recomputed at every rvi
- Alternatively, impose precedence constraints on the transactions to conform to the derived-from relationship
25Relationship to Active Databases
- Basic building block in active DB: Event, Condition, Action (ECA)
- On event
- If condition
- Do action
- Upon the occurrence of the specified event, if the condition holds, then trigger the specified action
- Good model for triggering periodic/aperiodic activities based on the events and conditions
- Timing constraints are not explicitly considered
26Relationship to Active Databases
- Active DB has necessary features for real-time data management
- Timing constraints should be considered
- Example
- On (10 seconds after initiating landing preparations)
- If (steps are not completed)
- Do (within 5 seconds abort landing)
27Transaction Processing in RTDB
- Key issue: predictability
- Will the transaction meet its timing constraint?
- Sources of unpredictability
- Processing hard real-time transactions
- Processing soft real-time transactions
28Sources of unpredictability in DB
- Dependence of transaction exec sequence on data values
- Very hard to predict the worst case exec time
- Avoid using unbounded loops and recursive or dynamically constructed data structures
- In RTDB, the data items accessed by a transaction are likely to be known once its functionality in the controlled environment is known
29Sources of unpredictability in DB
- Data & resource conflicts
- Wait for data and resources, e.g., CPU & I/O devices
- Data consistency requirements exacerbate the problem
- Long blocking due to concurrency control
- Priority inversion
- Deadlock: 2PL is not free of deadlock
30Sources of unpredictability in DB
- Dynamic paging & I/O
- Demand paging in disk-resident databases
- Very pessimistic worst case where all data need to be fetched from disk
- Disk scheduling & buffering
- Main memory databases eliminate these problems
31Aborts, rollbacks, and restarts
- Transaction aborts, rollbacks, and restarts
- A transaction can be aborted and restarted several times before it commits
- Total exec time increases. If the total number of aborts cannot be controlled, it can be unbounded
- Resources & time needed to deal with aborts & restarts can be denied to other transactions
32Pre-analysis of transactions
- Get an estimate of a transaction's exec time & data/resource requirements
- Impossible for complex transactions
- Two-phase transaction exec
- Pre-fetch phase
- A transaction is run once, bringing the necessary data into main memory
- Access invariance [15]: a transaction's exec path does not change due to possible concurrent changes made to the data by other transactions while the transaction is going through its pre-fetch phase
- No writes are performed
- Conflicts with other transactions are not considered
- Determine computation demands
33Pre-analysis of transactions
- Two-phase transaction exec
- Try to guarantee the transaction will commit by its deadline in the 2nd phase
- Ensure the necessary data & processing resources are available at the appropriate times via planning
- If access invariance holds, a transaction will complete by its deadline
- No recovery such as undo is necessary if a transaction is unable to execute
- How much overhead? Worth it?
34Dealing with Hard Deadlines
- Must meet all deadlines
- Requirements
- Transactions should be periodic
- WCET & resource requirements must be determined
- Many restrictions on the structure & characteristics of RT transactions
- → RT scheduling techniques can be applied
35Dealing with Soft Deadlines
- More leeway
- Most DB applications are not hard but soft real-time
- Meet as many deadlines as possible
- Firm deadline
- Abort a transaction upon its deadline miss
- Don't waste resources on tardy transactions
- Always good? Different application semantics?
- Real-time scheduling and conflict resolution are required
36Scheduling
- EDF (Earliest Deadline First)
- Least slack first
- Schedule the transaction with the least slack (i.e., deadline - current time - remaining exec. time) first
- High overhead
- Priority changes very often
- Highest value first
- Highest value density (value/exec time)
- How to determine value?
- Longest executed transaction first
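The first four policies differ only in the key used to pick the next transaction, which a short Python sketch can make concrete. The `Txn` record, the `pick` function, and the policy labels `"EDF"`, `"LSF"`, `"HVF"`, `"HVD"` are hypothetical; the sketch also assumes the remaining execution time is known and uses it as the exec time for value density.

```python
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    deadline: float
    remaining: float  # estimated remaining execution time (assumed known)
    value: float

def slack(t, now):
    """deadline - current time - remaining exec time"""
    return t.deadline - now - t.remaining

def pick(ready, policy, now=0.0):
    """Return the transaction the given policy would run next."""
    if policy == "EDF":   # earliest deadline first
        return min(ready, key=lambda t: t.deadline)
    if policy == "LSF":   # least slack first
        return min(ready, key=lambda t: slack(t, now))
    if policy == "HVF":   # highest value first
        return max(ready, key=lambda t: t.value)
    if policy == "HVD":   # highest value density
        return max(ready, key=lambda t: t.value / t.remaining)
    raise ValueError(f"unknown policy: {policy}")

ready = [
    Txn("A", deadline=10, remaining=2, value=5),
    Txn("B", deadline=12, remaining=8, value=20),
    Txn("C", deadline=15, remaining=1, value=4),
]
for policy in ("EDF", "LSF", "HVF", "HVD"):
    print(policy, pick(ready, policy).name)
# EDF A
# LSF B
# HVF B
# HVD C
```

Slack depends on the current time, so least slack first has to re-evaluate priorities as time advances, which is the overhead the slide mentions.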
37Conflict resolution: 2PL variations
- Priority inheritance
- If a high priority transaction is blocked by a low priority transaction, the low priority transaction inherits the high priority
- Reduces blocking time; however,
- Blocking time: duration of a transaction under strict 2PL
- Priority abort
- A high priority transaction aborts a low priority transaction upon a data conflict
- Better real-time performance than priority inheritance
- 2PL-PA/2PL-HP is well accepted in RTDB
- Low priority transactions may suffer repeated aborts and restarts, which can be a problem in, e.g., e-commerce
38Conflict resolution: Optimistic concurrency control
- Assume there's no data conflict during a transaction execution
- Keep executing a transaction
- Upon finishing every operation in a transaction, enter the validation phase
- If validation succeeds, the transaction commits
- Otherwise, it is aborted
39Conflict resolution: Optimistic concurrency control
- Backward validation
- A validating transaction is aborted if it conflicts with transactions already committed
- Characteristics of the validating or ongoing transactions cannot be considered for conflict resolution
- Forward validation
- A validating transaction aborts ongoing transactions if there's a conflict
- More applicable to RTDB
- Wait-50: a validating transaction blocks as long as more than half the transactions that conflict with it have earlier deadlines
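The Wait-50 rule reduces to a simple count over the conflicting transactions. The Python sketch below covers only that decision; the function name and the representation of transactions by their deadlines are assumptions, and a full implementation would re-evaluate the rule whenever the conflict set changes.

```python
def wait50_should_wait(validating_deadline, conflicting_deadlines):
    """True if more than half of the conflicting transactions have earlier deadlines."""
    if not conflicting_deadlines:
        return False  # nothing conflicts, so validation can proceed
    earlier = sum(1 for d in conflicting_deadlines if d < validating_deadline)
    return earlier > len(conflicting_deadlines) / 2

print(wait50_should_wait(10, [5, 6, 20]))  # True: 2 of 3 are more urgent
print(wait50_should_wait(10, [5, 20]))     # False: exactly half, not more
print(wait50_should_wait(10, []))          # False
```

When the function returns `False`, forward validation proceeds as described above: the validating transaction commits and the conflicting ongoing transactions are aborted.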
40Distributed RTDB
- Very little work has been done
- Challenges
- Transaction commitment protocol, e.g., 2PC (Two Phase Commit), has high overhead
- Unpredictable network delay
- Opportunities
- Data & resource availability at remote nodes
- Load balancing
- Fault/intrusion tolerance
41Two Phase Commit (2PC) Protocol
- Supports integrity in distributed databases used in, e.g., airline reservation, banking, and stock trading
- All participating databases must either commit, or abort and roll back
- Prepare phase: each database informs the coordinator whether it will commit or abort a transaction
- Commit phase: commit if every database intends to commit; otherwise, abort & roll back
- Drawback
- If even one database is unavailable, none of the other databases can commit
- Too much overhead for real-time applications
- Better approaches are required!
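The coordinator's decision rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions: votes are passed in as a dictionary, `None` stands for a database that did not reply, and networking, logging, timeouts, and recovery are left out. The function name `two_phase_commit` is hypothetical.

```python
def two_phase_commit(votes):
    """votes maps each database to True (commit), False (abort), or None (no reply)."""
    # Prepare phase: the coordinator has collected one vote per database.
    # Commit phase: commit only if every database voted to commit.
    unanimous = bool(votes) and all(v is True for v in votes.values())
    decision = "commit" if unanimous else "abort"
    return {db: decision for db in votes}

print(two_phase_commit({"db1": True, "db2": True}))
# {'db1': 'commit', 'db2': 'commit'}
print(two_phase_commit({"db1": True, "db2": None}))
# {'db1': 'abort', 'db2': 'abort'}
```

The second call shows the drawback named above: one unavailable database keeps every other participant from committing.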
42QoS Tradeoff & Overload Management
- APPROXIMATE
- Monotonically increase the accuracy of the answer to a query as more exec time is spent
- Provide an approximate answer, if necessary, to meet the deadline
- Epsilon serializability
- Allow transactions to read data while concurrent writes are going on
- Bound the error to be below the specified epsilon
- Timeliness & security tradeoff
- Apply a weaker security mechanism under overload
- Good idea?
43Research issues
- QoS guarantees in RTDB
- Transaction timeliness & data freshness
- Distributed real-time data management
- Security
- Access control for RTDB?
- New applications
- e-commerce: QoS guarantees given dynamic workloads
- Embedded applications: timeliness, data temporal consistency, energy-efficiency, composability, security, real-time data-centric routing and sensor data aggregation, ...
44Questions?