Management of XML Documents in Object-Relational Databases - PowerPoint PPT Presentation

About This Presentation
Title:

Management of XML Documents in Object-Relational Databases

Description:

object-relational database technology good choice to represent complex objects ... ( Type_Professor( Jaeger , TypeVA_Subject ( CAD , CAE ), Computer Science )), 4 ) ... – PowerPoint PPT presentation

Number of Views:24
Avg rating:3.0/5.0
Slides: 29
Provided by: kudr
Category:

less

Transcript and Presenter's Notes

Title: Management of XML Documents in Object-Relational Databases


1
Management of XML Documents in Object-Relational
Databases
  • Thomas Kudrass Matthias Conrad
  • HTWK Leipzig

EDBT-WorkshopXML-Based Data ManagementPrague,
24 March 2002
2
Overview
  • Motivation
  • Object-Relational Database Concepts
  • Parsing XML Documents
  • XML-to-ORDB Mapping
  • Meta-Data
  • Special Issues
  • Conclusions

3
Motivation
  • Storing of XML documents in DBMS
  • Use existing database technology
  • Dealing with complex objects
  • XML documents complex objects
  • avoid any decomposition
  • object-relational database technology good choice
    to represent complex objects

4
User-Defined Types in ORDB
  • Complex Data Types
  • Object Type
  • Collection Type
  • Object References
  • Object Views

5
Example Object Types
CREATE TYPE Type_Professor AS OBJECT
( PName VARCHAR(80), Subject VARCHAR(120)
) object-valued object table
attribute CREATE TYPE Type_Course AS OBJECT (
CREATE TABLE TabProfessor OF Name
VARCHAR(100), Type_Professor Professor Type_Pr
ofessor )
6
Example Collection Types
CREATE TYPE Type_Professor AS OBJECT
( PName VARCHAR(80), Subject VARCHAR(120)
) Array Nested
Table CREATE TYPE TypeVa_ Professor AS
CREATE TYPE Type_TabProfessor AS
VARRAY(5) OF Type_Professor
TABLE OF Type_Professor   CREATE TABLE
TabDept ( DName VARCHAR(80), Professor
Type_TabProfessor ) NESTED TABLE
Professor STORE AS TabProfessor_List
7
Example Object References
CREATE TYPE Type_Professor AS OBJECT
( PName VARCHAR(80), Dept VARCHAR(120)
)   CREATE TABLE TabProfessor OF
Type_Professor   CREATE TYPE Type_Course AS
OBJECT ( Name VARCHAR(200), Prof_Ref REF
Type_Professor )   CREATE TABLE TabCourse OF
Type_Course
Reference to objects of object table TabProfessor
8
Parsing DTD and XML
XML Document
DTD
Well-Formedness Validity Check
Syntax Check
XML V2 Parser
DTD Parser
XML DOM Tree
DTD DOM Tree
Schema Definition
XML2 Oracle
JDBC / ODBC
DBMS Oracle
9
1 lt!ELEMENT University
(StudyCourse,Student)gt 2 lt!ELEMENT
Student (LName,FName,Course)gt 3
lt!ATTLIST Student StudNr CDATA REQUIREDgt 4
lt!ELEMENT Course (Name,Professor,CreditPts?)
gt 5 lt!ELEMENT Professor
(PName,Subject,Dept)gt 6 lt!ENTITY cs
Computer Sciencegt 7 lt!ELEMENT LName
(PCDATA)gt 8 lt!ELEMENT FName
(PCDATA)gt 9 lt!ELEMENT Name
(PCDATA)gt 10 lt!ELEMENT CreditPts
(PCDATA)gt 11 lt!ELEMENT PName
(PCDATA)gt 12 lt!ELEMENT Subject
(PCDATA)gt 13 lt!ELEMENT Dept
(PCDATA)gt 14 lt!ELEMENT StudyCourse
(PCDATA)gt
10
ObjectBasedMapping
  • Modification of the Mapping Algorithm Bourret
  • ? No class definitions
  • ? Use objects of the DTD tree

11
1 lt!ELEMENT University
(StudyCourse,Student)gt 2 lt!ELEMENT
Student (LName,FName,Course)gt 3
lt!ATTLIST Student StudNr CDATA REQUIREDgt 4
lt!ELEMENT Course (Name,Professor,CreditPts?
)gt 5 lt!ELEMENT Professor
(PName,Subject,Dept)gt 6 lt!ENTITY cs
Computer Sciencegt 7 lt!ELEMENT LName
(PCDATA)gt 8 lt!ELEMENT FName
(PCDATA)gt 9 lt!ELEMENT Name
(PCDATA)gt 10 lt!ELEMENT CreditPts
(PCDATA)gt 11 lt!ELEMENT PName
(PCDATA)gt 12 lt!ELEMENT Subject
(PCDATA)gt 13 lt!ELEMENT Dept
(PCDATA)gt 14 lt!ELEMENT StudyCourse
(PCDATA)gt
Step 1
  • Each Complex Element ? Table
  • Each Set-Valued Element ? Table
  • Primary Key in each Table

1 lt!ELEMENT University (StudyCourse,Student)gt CR
EATE TABLE TabUniversity ( IDUniversity   2
lt!ELEMENT Student (LName,FName,Course)gt CREATE
TABLE TabStudent ( IDStudent   4
lt!ELEMENT Course (Name,Professor,CreditPts?)gt
CREATE TABLE TabCourse ( IDCourse   5
lt!ELEMENT Professor (PName,Subject,Dept)gt
CREATE TABLE TabProfessor (
IDProfessor CREATE TABLE TabSubject
( IDSubject
12
1 lt!ELEMENT University
(StudyCourse,Student)gt 2 lt!ELEMENT
Student (LName,FName,Course)gt 3
lt!ATTLIST Student StudNr CDATA REQUIREDgt 4
lt!ELEMENT Course (Name,Professor,CreditPts?)
gt 5 lt!ELEMENT Professor
(PName,Subject,Dept)gt 6 lt!ENTITY cs
Computer Sciencegt 7 lt!ELEMENT LName
(PCDATA)gt 8 lt!ELEMENT FName
(PCDATA)gt 9 lt!ELEMENT Name
(PCDATA)gt 10 lt!ELEMENT CreditPts
(PCDATA)gt 11 lt!ELEMENT PName
(PCDATA)gt 12 lt!ELEMENT Subject
(PCDATA)gt 13 lt!ELEMENT Dept
(PCDATA)gt 14 lt!ELEMENT StudyCourse
(PCDATA)gt
Step 2
Other Elements Attributes ? Table Columns
CREATE TABLE TabCourse ( IDCourse, attrName, at
trCreditPts, CREATE TABLE TabProfessor
( IDProfessor, attrPName, attrDept, CREATE
TABLE TabSubject ( IDSubject, attrSubject,
CREATE TABLE TabUniversity ( IDUniversity, attrS
tudyCourse,   CREATE TABLE TabStudent
( IDStudent, attrStudNr, attrLName, attrFName,
  CREATE TABLE TblMatrikelNr ( IDMatrikelNr, at
trMNummer,
13
Step 3
Relationships between Elements ? Foreign Keys
CREATE TABLE TabUniversity ( IDUniversity
INTEGER NOT NULL, attrStudyCourse
VARCHAR(4000) NOT NULL, PRIMARY KEY
(IDUniversity))   CREATE TABLE TabStudent
( IDStudent INTEGER NOT
NULL, IDUniversity INTEGER NOT
NULL, attrStudNr VARCHAR(4000) NOT
NULL, attrLName VARCHAR(4000) NOT
NULL, attrFName VARCHAR(4000) NOT
NULL, PRIMARY KEY (IDStudent), CONSTRAINT
conMatrikel FOREIGN KEY (IDUniversity)
REFERENCES TabUniversity (IDUniversity))  ...
14
ORDBS Oracle and XML
  • Basic Idea
  • Generate an object-relational schema from the DTD
  • Natural representation of an XML document by
    combining user-defined types
  • Different Mapping Rules
  • Simple elements
  • Complex elements
  • Set-valued elements
  • Complex set-valued elements

15
XML Attributes Simple Elements
  • Elements of PCDATA type and XML attributes
  • ? Attributes of the object type
  • Domain of Simple Elements
  • No type information in the DTD
  • numeric vs. alphanumeric?
  • length?
  • Restrictions of the DBMS (e.g. VARCHAR Oracle
    4000 characters)
  • Mapping of an XML attribute of a simple element
  • ? Definition of an object type for both attribute
    and element

16
XML Attributes Simple Elements
lt!ELEMENT Professor (PName,Subject,Dept)gt lt!ATTLIS
T Professor PAddress CDATA REQUIREDgt lt!ELEMENT
PName (PCDATA)gt lt!ELEMENT Subject
(PCDATA)gt lt!ELEMENT Dept (PCDATA)gt lt!ATTLIST
Dept DAddress CDATA REQUIREDgt
CREATE TABLE TabProfessor OF
Type_Professor CREATE TYPE Type_Professor AS
OBJECT ( attr PAddress VARCHAR(4000),
attrPName VARCHAR(4000), attrSubject
VARCHAR(4000), attrDept Type_Dept) CREA
TE TYPE Type_Dept AS OBJECT ( attrDept
VARCHAR(4000), attrDAddress VARCHAR(4000))
 
17
Complex Elements
Nesting of elements by composite DB object types
CREATE TABLE TabUniversity ( attrStudyCourse
VARCHAR(4000), attrStudent Type_Matrikel
)   CREATE TYPE Type_Student AS OBJECT
( attrStudNr VARCHAR(4000), attrLName VARCHAR(40
00), attrFName VARCHAR(4000), attrCourse
Type_Vorlesung )   CREATE TYPE Type_Course AS
OBJECT ( attrName VARCHAR(4000), attrProfessor T
ype_Professor, attrCreditPts VARCHAR(4000))   C
REATE TYPE Type_Professor AS OBJECT
( attrPName VARCHAR(4000), attrSubject
VARCHAR(4000), attrDept VARCHAR(4000))
INSERT INTO TabUniversity VALUES ( Computer
Science' , Type_Student('23374','Conrad','Matthi
as', Type_Course(Databases II,
Type_Professor(Kudrass ,
Database Systems',

Computer Science), '4')))
SELECT u.attrStudent.attrLname FROM
TabUniversity u WHERE u.attrStudent.attrCourse.att
rProfessor.attrPName Kudrass'
18
Set-Valued Elements
  • Multiple Occurrence (in DTD) marked by or
  • DBMS Restrictions
  • collection type applicable to set-valued elements
    with text-valued subelements, e.g. ARRAY OF
    VARCHAR
  • collection type not applicable to set-valued
    elements with complex subelements
  • subelements may be set-valued again
  • Solutions
  • use newer DBMS releases (e.g. Oracle 9i)
  • model relationships with object references

19
Set-Valued Elements
lt!ELEMENT University (StudyCourse,Student)gt
Reference to University Objects
CREATE TYPE Type_Student AS OBJECT (
attrJahrgang VARCHAR(4000) ,
attrUniversity REF Type_University
)   CREATE TABLE TabStudent OF
Type_Student   CREATE TYPE Type_University AS
OBJECT( attrStudyCourse
VARCHAR(4000))   CREATE TABLE TabUniversity OF
Type_University
  • Set-valued element Student
  • Modeling in object type Type_Student with a
    reference
  • to objects of the table TabUniversity

20
Set-Valued Elements
CREATE TYPE TypeVA_Course AS VARRAY(100) OF
Type_Course CREATE TYPE TypeVA_Professor AS
VARRAY(100) OF Type_Professor CREATE TYPE
TypeVA_Subject AS VARRAY(100) OF
VARCHAR(4000) CREATE TABLE TabUniversity
( attrStudyCourse VARCHAR(4000), attrStudent
Type_Matrikel )   CREATE TYPE Type_Student AS
OBJECT ( attrStudNr VARCHAR(4000), attrLName VAR
CHAR(4000), attrFName VARCHAR(4000), attrCourse
Type_Vorlesung )   CREATE TYPE Type_Course AS
OBJECT ( attrName VARCHAR(4000), attrProfessor T
ype_Professor, attrCreditPts VARCHAR(4000))   C
REATE TYPE Type_Professor AS OBJECT
( attrPName VARCHAR(4000), attrSubject
VARCHAR(4000), attrDept VARCHAR(4000))
21
Set-Valued ElementsExample
INSERT INTO TabUniversity VALUES ( Computer
Science' , TypeVA_Student (
Type_Student('23374','Conrad','Matthias',
TypeVA_Course ( Type_Course(Databas
es II, TypeVA_Professor (
Type_Professor(Kudrass ,
TypeVA_Subject ( Database
Systems,Operating Systems), Computer
Science)),4), Type_Course(CAD
Intro, TypeVA_Professor (
Type_Professor(Jaeger ,
TypeVA_Subject ( CAD,CAE),
Computer Science)),4),
...)), Type_Student(00011',Meier',Ralf', )
) ...)
22
Dealing with Null Values
  • Restrictions with NOT NULL constraints in
    object-relational DB schema
  • NOT NULL constraints in table - not in object
    type!
  • NOT NULL constraints not applicable to collection
    types
  • Object-valued attributes
  • use CHECK constraints for NOT NULL
  • Loss of DTD semantics DTD in the database

23
Dealing with CHECK Constraints
lt!ELEMENT Course (CName, Address?)gt lt!ELEMENT
Addresse (Street, City?)gt       CREATE TYPE
Type_Address AS OBJECT ( attrStreet VARCHAR(4000)
, attrCity VARCHAR(4000))   CREATE TYPE
Type_Course AS OBJECT ( attrName VARCHAR(4000),
attrAddress Type_Address)   CREATE TABLE
TabCourse OF Type_Course ( attrName NOT
NULL, CHECK (attrAdresse.attrStrasse IS NOT
NULL))
 // ORA-02290 Desired error message 1.
INSERT INTO TabCourse ( VALUES (CAD
Intro,Type_Address
(NULL,Leipzig)   // ORA-02290
Undesired error message 2. INSERT INTO
TabCourse ( VALUES ('RN', NULL)

24
Meta-Data about XML Documents
  • Unique DocumentID for each Document
  • Prolog Information
  • Document Location (URL)
  • Name Space
  • Element vs. Attribute

25
Naming Conventions for DB Objects
  • Rules
  • TabElementname ? Table Name
  • Type_Elementname ? Object Type Name
  • TypeVa_Elementname ? Array Name
  • No Conflicts with Keywords
  • Introduction of a Schema ID
  • Naming Rule
  • SchemaID Naming Convention Name

CREATE TYPE DTD01_Type_University
CREATE TYPE DTD02_Type_University AS OBJECT (
AS OBJECT (
attrStudyCourse VARCHAR(4000) )
attrRegister VARCHAR(4000) )
26
Conclusions Advantages
  • Non-atomic domains possible
  • Natural representation of XML Documents
  • Nesting of any complexity possible
  • Simple queries by using dot notation
  • Using object references to represent
    relationships (OIDs)

27
Conclusions Drawbacks
  • Mapping Deficiencies
  • Possible restrictions of element types in
    collections
  • No adequate mapping of NOT NULL constraints
  • Loss of Information
  • Prolog, Comments, Processing Instructions, Prolog
  • Entity References
  • Attribute vs. Element ?
  • Schema Evolution
  • Modification of DTD ? Modification of DB
  • Type Information
  • Target type VARCHAR - not sufficient!

28
Outlook
  • Graph-based creation of a schema
  • Source XML Schema
  • Use CLOB datatype
  • Enhance Meta-Schema
  • Comments, Processing Instructions and their
    position in document
  • Entity references and their substitution text
Write a Comment
User Comments (0)
About PowerShow.com