CS240A:%20Databases%20and%20Knowledge%20Bases%20Temporal%20Applications%20and%20SQL:1999 - PowerPoint PPT Presentation

About This Presentation
Title:

CS240A:%20Databases%20and%20Knowledge%20Bases%20Temporal%20Applications%20and%20SQL:1999

Description:

The physical level: efficient storage and indexing ... Financial: Stock market data ... Scientific: Timestamping satellite images. Dating archeological finds ... – PowerPoint PPT presentation

Number of Views:35
Avg rating:3.0/5.0
Slides: 31
Provided by: Fushen6
Learn more at: http://web.cs.ucla.edu
Category:

less

Transcript and Presenter's Notes

Title: CS240A:%20Databases%20and%20Knowledge%20Bases%20Temporal%20Applications%20and%20SQL:1999


1
CS240A Databases and Knowledge BasesTemporal
Applications and SQL1999
  • Carlo Zaniolo
  • Department of Computer Science
  • University of California, Los Angeles
  • February 2003

2
Temporal Databases Overview
  • Many applications
  • The problem is harder than what you think
  • Support for time in SQL the good and the bad
  • A time ontology
  • Many approaches proposed
  • TSQL2
  • The physical level efficient storage and
    indexing techniques.

3
An Introduction to Temporal Databases
  • Applications abound
  • A case study using SQL Queries on timevarying
    data are hard to express in SQL.
  • Temporal databases provide builtin support for
    storing and querying time-varying information.

4
Applications Abound Examples
  • Academic Transcripts record courses taken in
    previous and the current semester or term and
    grades for previous courses
  • Accounting What bills were sent out and when,
    what payments were received and when?
  • Delinquent accounts, cash flow over time
  • Moneymanagement software such as Quickencan show
    e.g., account balance over time.
  • Budgets Previous and projected budgets,
    multiquarter or multiyear budgets

5
Temporal DB Applications (cont.)
  • Data Warehousing Historical trend analysis for
    decision support
  • Financial Stock market data
  • Audit why were financial decisions made, and
    with what information available?
  • GIS Geographic Information Systems ()
  • Land use over time boundary of parcels
    changeover time, as parcels get partitioned and
    merged.
  • Title searches
  • Insurance Which policy was in effect at each
    point in time, and what time periods did that
    policy cover?

6
Temporal DB Applications (cont.)
  • Medical records Patient records, drug regimes,
    lab tests.Tracking course of disease
  • Payroll Past employees, employee salary
    history, salaries for future months, records of
    withholdingrequested by employees
  • Capacity planning for roads and utilities.
    Configuring new routes, ensuring high utilization
  • Project scheduling Milestones, task assignments
  • Reservation systems airlines, hotels, trains.
  • Scientific Timestamping satellite images. Dating
    archeological finds

7
Temporal DBs Applications Conclusion
  • It is difficult to identify applications that do
    not involve the management of temporal data.
  • These applications would benefit from builtin
    temporal support in the DBMS. Main benefits
  • More efficient application development
  • Potential increase in performance

8
Reviewing the Situation
  • The importance of temporal applications has
    motivated much research on temporal DBs but no
    satisfactory solution has been found yet
  • SQL3 does not support temporal queries
  • Temporal DBs remain an open research problem.
  • The problem is much more difficult than it
    appears at first we have become so familiar with
    the time domain that we tend to overlook its
    intrinsic complexity.
  • Some of the solutions proposed by researchers
    lack ease of use and amenability to efficient
    implementation

9
Case Study
  • University of Arizona's Office of Appointed
    Personnel has some information in a database.
  • Employee(Name, Salary, Title)
  • Finding an employee's salary is easy.
  • The OAP wishes to add the date of birth
  • Employee(Name, Salary, Title, DateofBirth
    DATE)
  • SELECT Salary, DateofBirth FROM
    Employee WHERE Name 'Bob'

10
Converting to a Temporal Database
  • Now the OAP wishes to computerize the employment
    history.
  • Adding validity periods to tuples
  • Employee (Name, Salary, Title, DateofBirth,
    Start DATE, Stop DATE)

11
Converting to a Temporal Database Example
  • Employee (Name, Salary, Title, DateofBirth,Start
    DATE, Stop DATE)

12
Extracting the Salary
  • Find the employee's salary at a given time e.g.
    the current one
  • SELECT SalaryFROM EmployeeWHERE Name
    'BobAND Start lt CURRENT_TIMESTAMPAND
    CURRENT_TIMESTAMP lt Stop
  • Instead of CURRENT_TIMESTAMP we could have given
    any time stamp or date

13
Distributing the Salary History
  • OAP wants to distribute to all employees their
    salary history.
  • Output For each person, maximal intervals at
    each salary
  • Employee could have arbitrarily many title
    changes between salary changes.
  • Name Salary Start Stop
  • Bob 60000 19930101 19930601Bob 70000
    19930601 19950101

14
Extracting the Salary, cont.
  • Alternative 1 Give the user a printout of Salary
    and Title information, and have user determine
    when his/her salary changed.
  • Alternative 2 Use SQL as much as possible. Find
    those intervals that overlap or are adjacent and
    thus should be merged.

15
Bobs Salary History in SQL
  • CREATE TABLE Temp(Salary, Start, Stop)AS SELECT
    Salary, Start, StopFROM Employee WHERE Name
    'Bob'
  • repeatUPDATE Temp AS T1 SET (T1.Stop)
    (SELECT MAX(T2.Stop) FROM Temp AS T2 WHERE
    T1.Salary T2.Salary AND T1.Start lt
    T2.Start AND T1.Stop gt T2.Start AND
    T1.Stop lt T2.Stop) WHERE EXISTS (SELECT
    FROM Temp AS T2 WHERE T1.Salary
    T2.Salary AND T1.Start lt T2.Start AND
    T1.Stop gt T2.Start AND T1.Stop lt T2.Stop)
  • until no tuples updated

16
Example
  • Initial table
  • After one pass
  • After two passes

17
Salary History (cont.)
  • Intervals that are not maximal must be deleted
  • DELETE FROM Temp T1WHERE EXISTS (SELECT
    FROM Temp AS T2 WHERE T1.Salary T2.Salary
    AND ( (T1.Start gt T2.Start AND T1.Stop lt
    T2.Stop OR (T1.Start gt T2.Start AND
    T1.Stop lt T2.Stop) )
  • The loop is executed lgN times in the worst case,
    where N is the number of tuples in a chain of
    overlapping or adjacent, valueequivalent tuples.
    Then delete extraneous, nonmaximal intervals.

18
Alternative 3 Entirely in SQL
  • CREATE TABLE Temp(Salary, Start, Stop) AS
    SELECT Salary, Start, Stop FROM Employee
    WHERE Name 'Bob'
  • SELECT DISTINCT F.Salary, F.Start, L.StopFROM
    Temp AS F, Temp AS LWHERE F.Start lt L.StopAND
    F.Salary L.SalaryAND NOT EXISTS (SELECT
    FROM Temp AS M WHERE M.Salary F.Salary AND
    F.Start lt M.Start AND M.Start lt L.Stop AND NOT
    EXISTS (SELECT FROM Temp AS T1 WHERE
    T1.Salary F.Salary AND T1.Start lt M.Start AND
    M.Start lt T1.Stop))AND NOT EXISTS (SELECT
    FROM Temp AS T2 WHERE T2.Salary F.Salary
    AND
  • ( (T2.Start lt F.Start AND F.Start lt T2.Stop)
    OR (T2.Start lt L.Stop AND L.Stop lt T2.Stop)))

19
Alternative 4 Using More Procedural Code
  • Use SQL only to open a cursor on the table
  • Maintain a linked list of intervals, each with a
    salary
  • Initialize this linked list to empty
  • DECLARE emp_cursor CURSOR FORSELECT Salary,
    Start, StopFROM EmployeeOPEN emp_cursorloop
  • FETCH emp_cursor INTO salary,start,stop if
    no data returned then go to finishedfind
    position in linked list to insert this
    information
  • go to loopfinishedCLOSE emp_cursoriterate
    through linked list, printing out dates and
    salaries

20
A More Drastic Alternatives Reorganize the schema
  • Separate Salary, Title, and DateofBirth
    information
  • Employee1 (Name, Salary, Start DATE, Stop DATE)
  • Employee2 (Name, Title, Start DATE, S top
    DATE)
  • Getting the salary information is now easy
  • SELECT Salary, Start, Stop FROM
    Employee1 WHERE Name 'Bob
  • But what if we want a table with both salary and
    title?

21
Temporal Joins
Employee1
Employee2
Their Temporal Join
22
Temporal Join in SQL
  • SELECT E1.Name, Salary, Title, E1.Start, E1.Stop
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.NameE2.Name AND E2.Start lt E1.Start
    AND E1.Stop lt E2.Stop
  • UNION ALL
  • SELECT E1.Name, Salary, Title,E1.Start, E2.Stop
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.Name E2.Name AND E1.Start gt E2.Start
    AND E2.Stoplt E1.Stop AND E1.Start lt E2.Stop
  • UNION ALL
  • SELECT E1.Name, Salary, Title E2.Start, E1.St
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.Name E2.Name AND E2.Start gt
    E1.Start AND E1.Stop lt E2.Stop AND E2.Start lt
    E1.Stop
  • UNION ALL
  • SELECT E1.Name, Salary, Title E2.Start,
    E2.Stop FROM Employee1 AS E1,
    Employee2 AS E2
  • WHERE E1.Name E2 Name AND E2.Start gt
    E1.Start AND E2.Stop lt E1.Stop AND NOT
    (E1.Start E2.Start AND E1.Stop E2.Stop)

23
Extracting the Salary History in TSQL2
  • SELECT Salary
  • FROM Employee
  • WHERE Name 'Bob
  • There is no explicit mention of time in the
    query. By default the system returns the
    coalesced time history

24
Temporal Joins in TSQL2
  • SELECT E1.Name, Salary, Title
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.Name E2.Name

25
Summary
  • Coalescing and temporal joins are very difficult
    to express in SQL.
  • Solutions proposed
  • Special operators for period-based representation
  • TSQL2 avoid explicit operations on periods
    (implicit model)
  • Point-Based Representation
  • Time stamp attributes rather than tuples
    (difficult on tables but not on structured XML
    documents)
  • Others,including combinations of above (more
    than 40 counted)

26
Operators on Periods
  • A new aggregate called coalesce
  • An overlap operator for joins
  • SELECT E1.Name, Salary, Title
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.Name E2.Name AND
  • overlap(E1.Start,E1.End, E2.Start, E2.End)
  • Definition of overlap

27
Allen Operators on Intervals
  • Overlap,
  • Contains,
  • Meets,
  • Precedes, follows.
  • Contains is also applicable to sets of intervals.

28
Point-Based Model
  • Employee1 (Name, Sal, Day )
  • Bob 6000
    19930101
  • Bob 6000
    19930531 Bob 7000
    19930531
  • Bob 7000
    19941231
  • Internally we might still use the period-based
    representation for point-based and TSQL2

29
Queries in Point Based
  • No coalescing needed in the query e.g., project
    out salary
  • SELECT E1.Name, E1.Day
  • FROM Employee1 AS E1
  • Temporal Joins are simple
  • SELECT E1.Name, Sal, Title
  • FROM Employee1 AS E1, Employee2 AS E2
  • WHERE E1.Name E2.Name AND E1.DayE2.Day

30
Conclusions
  • Several alternatives, in terms of data model and
    SQL extensions to be used,
  • Internal representation often must be divorced
    from external oneadding to alternatives and
    complexity
  • New temporal clustering and indexing schemes
    should be used for maximum performance.
Write a Comment
User Comments (0)
About PowerShow.com