Title: Temporal Information and XML
1. Temporal Information and XML
- Carlo Zaniolo
- Department of Computer Science
- University of California, Los Angeles
2. A Short History of Time in Databases
- Relational model: between 33 and 48 temporal DB proposals counted
- A struggle to get around the limitations of relational (flat) tables and a rigid query language (SQL)
- A key issue: temporal interval coalescing is needed after each projection!
- Clustering, indexing, and query optimization for temporal information add to the complexity
3. Coalescing
- Time stamping the individual tuples: if we want the salary history, we have to coalesce the last three tuples into one
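The coalescing step can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular system: the function name `coalesce`, the inclusive day-granularity periods, and the sample rows are all assumptions (the rows are modeled on the employee values that appear in the XML example later in the deck). After projecting tuple-timestamped rows onto (empno, sal), value-equivalent rows with adjacent or overlapping periods are merged.

```python
from datetime import date, timedelta

def coalesce(rows):
    """Merge value-equivalent rows whose periods overlap or are adjacent.

    rows: iterable of (key, value, tstart, tend) with inclusive date bounds.
    """
    merged = []
    # Sorting groups rows by key and value, ordered by start date.
    for key, value, tstart, tend in sorted(rows):
        if merged:
            lkey, lvalue, lstart, lend = merged[-1]
            same_fact = (lkey, lvalue) == (key, value)
            # Adjacent means the new period starts no later than the day
            # after the previous period ends.
            if same_fact and tstart <= lend + timedelta(days=1):
                merged[-1] = (lkey, lvalue, lstart, max(lend, tend))
                continue
        merged.append((key, value, tstart, tend))
    return merged

# Hypothetical tuple-timestamped rows: (empno, sal, title, tstart, tend).
employee = [
    (10003, 60000, "Engineer",    date(1995, 1, 1),  date(1995, 5, 31)),
    (10003, 70000, "Engineer",    date(1995, 6, 1),  date(1995, 9, 30)),
    (10003, 70000, "Sr Engineer", date(1995, 10, 1), date(1996, 1, 31)),
    (10003, 70000, "Tech Leader", date(1996, 2, 1),  date(1996, 12, 31)),
]

# Project away the title, then coalesce to obtain the salary history.
projected = [(empno, sal, ts, te) for empno, sal, _title, ts, te in employee]
salary_history = coalesce(projected)
```

In this sample the last three rows differ only in title, so the projection leaves three value-equivalent salary rows that collapse into a single period.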
4. XML
- XML: hierarchical views with temporal groups
- Temporally grouped models are more natural and powerful, but they did not fit in the flat relational model
- XML query languages can easily express temporal queries on these views.
5. History Tables
- Time-stamped tuples in relations
- Temporally grouped time-stamped attribute values
6. Historical XML Database Architecture: Two Approaches
- Native XML databases
  - Historical data are stored in a native XML database
  - XML queries can be specified directly upon the database
  - Native XML databases: Tamino (Software AG), eXcelon (XIS)
- XML-enabled RDBMS
  - Historical view decomposed into relational databases as binary tables
  - Historical data can then be published as XML documents through SQL/XML publishing functions or queried through a middleware as XML views
7. Historical XML Views Architecture
[Figure: architecture diagram. Labels: SQL Queries; Relational Data Current Content; Temporal Queries; XML Views; Historical Data; Historical Database]
8. Relational Storage of Temporal Relational Data
- Relational schema
  - employee(empno, name, sal, title, deptno)
- Attribute history tables
  - employee_sal(empno, sal, tstart, tend)
  - employee_title(empno, title, tstart, tend)
  -
- An internal relation for each time-varying attribute
- XQuery statements on the XML views are translated into SQL statements on the internal relations
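The attribute history tables above can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: SQLite stands in for the RDBMS, dates are stored as ISO-8601 text with inclusive bounds, the sample rows are hypothetical, and the SQL in `snapshot` is hand-written to show the kind of statement a snapshot query on the XML view might be translated into, not the output of the actual translator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_sal   (empno INTEGER, sal   INTEGER, tstart TEXT, tend TEXT);
CREATE TABLE employee_title (empno INTEGER, title TEXT,    tstart TEXT, tend TEXT);
""")

# Hypothetical history for one employee; one row per period of validity.
conn.executemany("INSERT INTO employee_sal VALUES (?, ?, ?, ?)", [
    (10003, 60000, "1995-01-01", "1995-05-31"),
    (10003, 70000, "1995-06-01", "1996-12-31"),
])
conn.executemany("INSERT INTO employee_title VALUES (?, ?, ?, ?)", [
    (10003, "Engineer",    "1995-01-01", "1995-09-30"),
    (10003, "Sr Engineer", "1995-10-01", "1996-01-31"),
    (10003, "Tech Leader", "1996-02-01", "1996-12-31"),
])

def snapshot(conn, empno, day):
    """Return (sal, title) valid for `empno` on `day` (ISO date string)."""
    return conn.execute(
        """
        SELECT s.sal, t.title
        FROM employee_sal AS s
        JOIN employee_title AS t ON s.empno = t.empno
        WHERE s.empno = ?
          AND s.tstart <= ? AND ? <= s.tend
          AND t.tstart <= ? AND ? <= t.tend
        """,
        (empno, day, day, day, day),
    ).fetchone()

def salary_history(conn, empno):
    """Return the salary history; no coalescing is needed on this table."""
    return conn.execute(
        "SELECT sal, tstart, tend FROM employee_sal WHERE empno = ? ORDER BY tstart",
        (empno,),
    ).fetchall()
```

Because each time-varying attribute has its own table, a history query reads a single table, while a snapshot query joins one table per requested attribute.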
9. Experiments
- Simulated data with history of 300,024 employees
- Comparing native XML DBs
  - Software AG's Tamino (text-based storage), XPath
  - eXcelon's XIS (XML Information Server) (OODBMS-based storage), XQuery
- Against DB2.
10. Preliminary Performance Comparisons
[Figure: Storage Size]
11. Performance Comparisons (cont'd)
[Figure: Query Performance of DB2 and Tamino]
Q1: scan of databases; Q2: history query; Q3, Q5: interval queries; Q4, Q6: snapshot queries; Q7: join
12. Performance Comparisons (cont'd)
13. Related Problems
- Query performance
  - Indexing: R-trees
  - Temporal clustering: tuples from the same time period should be assigned to the same page
  - Page usefulness method: take a page with employee records for a department. After 60% quit, that page is only 40% useful.
- Compression should not be ruled out
  - sparingly used in DBs, but important for XML
  - DB2 mainframes, Oracle
- Updates are not a problem for histories.
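The page usefulness figure in the example above can be read as the fraction of records on a page that are still current at a given time. The sketch below follows that reading only; the slide does not define the measure precisely, so the function `usefulness`, the inclusive period representation, and the sample page are assumptions.

```python
from datetime import date

def usefulness(page, day):
    """Fraction of records on `page` whose validity period contains `day`.

    page: list of (tstart, tend) periods with inclusive date bounds.
    """
    if not page:
        return 0.0
    live = sum(1 for tstart, tend in page if tstart <= day <= tend)
    return live / len(page)

# Hypothetical page holding ten employee records of one department:
# six employees quit at the end of 1995, four stay through 1996.
page = (
    [(date(1995, 1, 1), date(1995, 12, 31))] * 6
    + [(date(1995, 1, 1), date(1996, 12, 31))] * 4
)

before = usefulness(page, date(1995, 6, 1))  # everyone still employed
after = usefulness(page, date(1996, 6, 1))   # only the four who stayed
```

A snapshot query for mid-1996 would read the whole page to retrieve four relevant records, which is why low usefulness hurts query performance.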
14. Research (cont.)
- XML query languages are powerful, and temporal queries can be expressed in XQuery without any extension, but not for all users
- User-friendly QBE-like language for temporally grouped model
- SQL/XML temporal views and queries
- ROLLUP-like temporal views (and SQL:1999)
- Different views, but the same RDBMS-based implementation underneath.
15. XML Representation of DB History: Table Columns as XML Elements
<employees tstart="1995-01-01" tend="1996-12-31">
  <employee tstart="1995-01-01" tend="1996-12-31">
    <empno tstart="1995-01-01" tend="1996-12-31">10003</empno>
    <name tstart="1995-01-01" tend="1996-12-31">Bob</name>
    <salary tstart="1995-01-01" tend="1995-05-31">60000</salary>
    <salary tstart="1995-06-01" tend="1996-12-31">70000</salary>
    <title tstart="1995-01-01" tend="1995-09-30">Engineer</title>
    <title tstart="1995-10-01" tend="1996-01-31">Sr Engineer</title>
    <title tstart="1996-02-01" tend="1996-12-31">Tech Leader</title>
    <dept tstart="1995-01-01" tend="1995-09-30">QA</dept>
    <dept tstart="1995-10-01" tend="1996-12-31">RD</dept>
    <DOB tstart="1995-01-01" tend="1996-12-31">1945-04-09</DOB>
  </employee>
  <!-- More -->
</employees>
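To show how history and snapshot queries read against this temporally grouped document, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The deck expresses such queries in XQuery; Python is used here only because its standard library has no XQuery engine. The helper names `history` and `snapshot` are hypothetical, and the document is a trimmed copy of the example above.

```python
import xml.etree.ElementTree as ET

DOC = """
<employees tstart="1995-01-01" tend="1996-12-31">
  <employee tstart="1995-01-01" tend="1996-12-31">
    <empno tstart="1995-01-01" tend="1996-12-31">10003</empno>
    <name tstart="1995-01-01" tend="1996-12-31">Bob</name>
    <salary tstart="1995-01-01" tend="1995-05-31">60000</salary>
    <salary tstart="1995-06-01" tend="1996-12-31">70000</salary>
    <title tstart="1995-01-01" tend="1995-09-30">Engineer</title>
    <title tstart="1995-10-01" tend="1996-01-31">Sr Engineer</title>
    <title tstart="1996-02-01" tend="1996-12-31">Tech Leader</title>
  </employee>
</employees>
"""

root = ET.fromstring(DOC)

def history(root, empno, attr):
    """Return [(value, tstart, tend), ...] for one attribute of one employee."""
    for emp in root.findall("employee"):
        if emp.findtext("empno") == str(empno):
            return [(e.text, e.get("tstart"), e.get("tend"))
                    for e in emp.findall(attr)]
    return []

def snapshot(root, empno, attr, day):
    """Return the attribute value valid on `day` (ISO date string), or None."""
    for value, tstart, tend in history(root, empno, attr):
        # ISO-8601 dates compare correctly as plain strings.
        if tstart <= day <= tend:
            return value
    return None

salaries = history(root, 10003, "salary")
title_in_march_1996 = snapshot(root, 10003, "title", "1996-03-01")
```

The salary history comes back already grouped, one element per period, so no coalescing step is required.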
16. Thank you!