Title: The need for Temporal Databases
1The need for Temporal Databases
(1) Need for coping with temporal data (both VT
and TT)
(2) Just adding 1 (or 2, or 4) temporal
attributes (and maybe some ad-hoc procedures)
does not work!
(3) First, a rigorous semantic framework is
needed, to give formal specification to the
implementation. Properties clearness,
expressiveness, upward compatibiliy. Ex. BCDM
(4) Second, the implementation must be proven to
respect the semantics. Core issue here
efficient (1-NF) implementations hardly grant
uniqueness of representation. Ex TSQL2
2Ad-hoc approaches are complex and are not likely
to work
Example 1. Projection (and temporal coalescing)
Name Salary Title VT_start VT_end
Andrea 60000 Ass. Provost 1/1/1993 30/5/1993
Andrea 70000 Ass. Provost 1/6/1993 30/9/1993
Andrea 70000 Provost 1/10/1993 31/1/1994
Andrea 70000 Professor 1/2/1994 31/12/1994
Question salary history of Andrea Intuitive
SQL query SELECT Salary,VT_start,VT_end FROM
Employee WHERE NameAndrea
3Ad-hoc approaches are complex and are not likely
to work
Result obtained
Salary VT_start VT_end
60000 1/1/1993 30/5/1993
70000 1/6/1993 30/9/1993
70000 1/10/1993 31/1/1994
70000 1/2/1994 31/12/1994
Desired result
Salary VT_start VT_end
60000 1/1/1993 30/5/1993
70000 1/6/1993 31/12/1994
4Ad-hoc approaches are complex and are not likely
to work
How to get the desired result using SQL92
CREATE TABLE Temp(Salary, VT_start,VT_end) AS
SELECT Salary, VT_start, VT_end FROM
Employee WHERE Name Andrea Repeat UPDATE
Temp T1 SET (T1.VT_end)SELECT
MAX(T2.VT_end) WHERE T1.SalaryT2.Salary AND
T1.VT_start lt T2.VT_Start AND T1.VT_end gt
T2.VT_start AND T1.VT_end lt T2.VT_end WHERE
EXISTS (SELECT FROM Temp AS T2 WHERE
T1.SalaryT2.Salary AND T1.VT_start lt
T2.VT_Start AND T1.VT_end gt T2.VT_start
AND T1.VT_end lt T2.VT_end) Until no tuples
updated
5Ad-hoc approaches are complex and are not likely
to work
How to get the desired result using SQL92
(continues!)
DELETE FROM Temp T1 WHERE EXISTS (SELECT FROM
Temp AS T2 WHERE T1.Salary T2.Salary AND
((T1.VT_start gt T2.VT_Start) AND (T1.VT_end lt
T2.VT_end)) OR ((T1.VT_start gt T2.VT_Start)
AND (T1.VT_end lt T2.VT_end))
6Ad-hoc approaches are complex and are not likely
to work
Underlying semantic phenomenon
Projection on temporal relations involves
temporal coalescing about value equivalent
tuples
When it occurs (SQL)
Whenever a proper subset of the attributes of the
relations is chosen in the SELECT part of the
query
7Ad-hoc approaches are complex and are not likely
to work
How to get the desired result using a Temporal DB
(ex. TSQL2)
SELECT Salary FROM Employee WHERE Name
Andrea
8Ad-hoc approaches are complex and are not likely
to work
Example 2. Join (and temporal intersection)
Employee1
Name Salary VT_start VT_end
Andrea 60000 1/1/1993 30/5/1993
Andrea 70000 1/6/1993 31/12/1994
Employee2
Name Title VT_start VT_end
Andrea Ass. Provost 1/1/1993 30/9/1993
Andrea Provost 1/10/1993 31/1/1994
Andrea Professor 1/2/1994 31/12/1994
Query combined history of both Andreas salary
and title
9Ad-hoc approaches are complex and are not likely
to work
Intuitive SQL query SELECT Salary, Title,
Emp1.VT_start, Emp1.VT_end Emp2.VT_start,
Emp2.VT_end FROM Employee1, Employee2 WHERE
Employee1.NameAndrea AND
Employee1.NameAndrea
10Ad-hoc approaches are complex and are not likely
to work
Result obtained
Salary Emp1. VT_start Emp1. VT_end Title Emp2. VT_start Emp2. VT_end
60000 1/1/1993 30/5/1993 Ass. Provost 1/1/1993 30/9/1993
60000 1/1/1993 30/5/1993 Provost 1/10/1993 31/1/1994
60000 1/1/1993 30/5/1993 Professor 1/2/1994 31/12/1994
70000 1/6/1993 31/12/1994 Ass. Provost 1/1/1993 30/9/1993
70000 1/6/1993 31/12/1994 Provost 1/10/1993 31/1/1994
70000 1/6/1993 31/12/1994 Professor 1/2/1994 31/12/1994
11Ad-hoc approaches are complex and are not likely
to work
Result desired
Salary Title VT_start VT_end
60000 Ass. Provost 1/1/1993 30/5/1993
70000 Ass. Provost 1/6/1993 30/9/1993
70000 Provost 1/10/1993 31/1/1994
70000 Professor 1/2/1994 31/12/1994
12Ad-hoc approaches are complex and are not likely
to work
How to get the desired result using SQL92
SELECT Employee1.Name,Salary,Dept,Employee1.VT_sta
rt,Employee1.VT_end FROM Employee1,
Employee2 WHERE Employee1.NameEmployee2.Name AND
Employee2.VT_start lt Employee1.VT_start AND
Employee1.VT_end lt Employee2.VT_end UNION SELECT
Employee1.Name,Salary,Dept,Employee1.VT_start,Empl
oyee2.VT_end FROM Employee1, Employee2 WHERE
Employee1.NameEmployee2.Name AND
Employee1.VT_start gt Employee2.VT_start AND
Employee2.VT_end lt Employee1.VT_end AND
Employye1.VT_start lt Employee2.VT_end UNION SELECT
Employee1.Name,Salary,Dept,Employee2.VT_start,Emp
loyee1.VT_end FROM Employee1, Employee2 WHERE
Employee1.NameEmployee2.Name AND
Employee2.VT_start gt Employee2.VT_start AND
Employee1.VT_end lt Employee2.VT_end AND
Employye2.VT_start lt Employee1.VT_end UNION SELECT
Employee1.Name,Salary,Dept,Employee2.VT_start,Emp
loyee2.VT_end FROM Employee1, Employee2 WHERE
Employee1.NameEmployee2.Name AND
Employee2.VT_start gt Employee1.VT_start AND
Employee2.VT_end lt Employee1.VT_end
13Ad-hoc approaches are complex and are not likely
to work
Underlying semantic phenomenon
Join (Cartesian product) on temporal relations
involves temporal intersection
When it occurs (SQL)
Whenever more than one relation is used in the
FROM part of the query
Note the number of terms in the SQL union is 2n,
where n is the number of relations in the FROM
part
14Ad-hoc approaches are complex and are not likely
to work
How to get the desired result using a Temporal DB
(ex. TSQL2) SELECT Salary, Title FROM
Employee1, Employee2 WHERE Employee1.NameAndr
ea AND Employee1.NameAndrea
15Ad-hoc approaches are complex and are not likely
to work
- Until now, just two simple examples concerning
- SELECT a subset of attributes (? loop to do
colaescing) - FROM with gt1 relations (exponential union to do
intersection)
- And what about
- Union, difference, , nested queries
- Temporal predicates
- Primary\secondary keys
- Aggregate functions
- Integrity constraints
- Multiple (user-defined!) granularities
-
- arbitrary combinations of all the above issues
?
16Ad-hoc approaches are complex and are not likely
to work
- Key message
- Dealing with temporal data is a general problem
in DBs - Difficult problems (often hidden ones) have
to be faced
WHY? (informally) (1) Relational model assumes
independence of attribute values Time
attributes predicates about when the values of
the other attributes hold in the modeled reality
(VT) or are current in the DB (TT)
17Ad-hoc approaches are complex and are not likely
to work
WHY? (informally) (2) Relational model assumes
all data\knowledge is explicit The association
of time to tuples involves semantic
assumptions Such assumptions are part of our
commonsense way of looking at time. However,
they are NOT supported by the relational model
(as long as they are left implicit) Making the
semantic of time explicit is a complex task
(everywhere, not only in DBs)
18Ad-hoc approaches are complex and are not likely
to work
- From a Software Engeneering point of view
- Letting applications solve the problem in an
ad-hoc way is - Both cost and time expensive
- Likely to lead to errors in the applications
- Likely to make integration (shared data)
between applications impossible
19Ad-hoc approaches are complex and are not likely
to work
Temporal DB an area of research aiming at
providing once-and-forall principled and
integrated solution to the problem