Title: XML DATABASES
1XML DATABASES
2Acknowledgement
- Some of the work is taken from various members
of EXEL Research Group at Faculty of IT, UTS
(http//exel.it.uts.edu.au/) - Mr Rajugan Rajagopalapillai
- Prof. Tharam S. Dillon
- Dr. Wenny Rahayu (La Trobe University)
- Mr. Torab Torabi (La Trobe University)
3WEB Databases
- There are two most important framework where web
databases are needed - Intensive Data-driven Web Applications - eg.
student enrolment system, student timetable
allocation, etc.). - Intensive Document-driven Web Applications eg.
on-line business transactions and contracts. - In this lecture, we will go through basic Web DB
architecture, and example of three common types
of Web DB application - Web Content Management
- E-Commerce Database Management
- XML Database Management
4WEB Databases - Architectures
- Architecture 1
- Web Browser -gt HTML pages -gt Script Files
Embedded SQL -gt middleware eg. JDBC, ODBC -gt
Database - Architecture 2
- Web Browser -gt HTML pages calls to C or Java
routine or Active X controls -gt compiled modules
-gt middleware -gtDatabase - Architecture 3
- Web Browser -gt Application Server (Oracle Forms
and Reports, ASP, Scripts, middleware) -gt Database
5WEB Databases - Architectures (ctd.)
- The three different architectures mentioned in
the previous slide are generic categories. We can
easily combine a number of different
architectures to set up our own web-database
systems. - These days, we can see more of Architecture type
3 made available by software/database industry
(eg. Oracle 8i/9i, Cold Fusion, etc.). This type
of architecture hides a lot of the complexity of
web-database set up into the application layer.
6Example Application 1 Web Content Management
using Oracle WebDB - Content Management
HTTP Client(s) Developer(s) Administrator(s)
Application Server (Remote Server) Web Server
(OAS, Apache) WebDB Listener
Internet
Internet
HTTP Client(s)
HTTP Client(s)
DB Server
Slide generated by R.Rajugan, Oracle Web DB,
2000
7Web Content Management
- The concept of consolidating web contents into a
repository or a database system. - This new concept of using databases to power web
sites is gaining popularity due to many benefits
it provides in managing unstructured data under a
structured framework. - For example, Oracle cut 1 billion in costs by
consolidating contents from more than 40 regional
Web sites - In Web Content Management, a web page is divided
into several sections (called web-assets). These
web-assets may include page header, page footer,
menu items, icons/images, and block contents. - Each web asset will have its own set of
attributes, eg. Colours, size, etc. - Web assets are stored in a DBMS following a
structural design.
8Web Content Management (example from Managing
Web Content from File system to Database, Oracle,
2000)
9Example Application 2 A three-tier Web and
E-Commerce architecture
Incoming internet request
Web servers in web server layer
E-Commerce servers in Mid-Tier layer
Catalog database
Application server in Back-end layer
ERP or Mainframe with inventory data
10A three-tier Web and E-Commerce architecture (ctd)
- Web server layer web servers route we requests
to e-commerce servers and assemble returned data
into web pages for clients. - Mid-tier layer e-commerce servers do the hard
work of authenticating users, generating
transactions, and retrieving data from back-end
systems. - Catalog databases hold specifications of
products that can be ordered. - Back-end layer ERP (Enterprise Resource
Planning) such as SAP or Mainframe applications
hold back-end inventory database.
11Database Driven E-Catalog (based on Sherif
Danish, ACM Sigmod Record Special Section on
E-Com 274 December 1998)
- Creation of a product information database in the
enterprise (see catalog database in middle-tier
layer from previous slides), and using this
database as a foundation for deploying an
electronic catalog. - Motivation we cant simply use memo or BLOB type
to store unstructered data for a catalog. Hence,
a new method of storing catalog is needed.
Structured Catalog Database
Paper catalog
12Database Driven E-Catalog - ctd. (based on Sherif
Danish, ACM Sigmod Record Special Section on
E-Com 274 December 1998)
- Implementation issues
- Generic Model contains attributes in the form of
name/value pairs. This form allows storage of
attributes for any product type in the same
table.
13Database Driven E-Catalog - ctd. (based on Sherif
Danish, ACM Sigmod Record Special Section on
E-Com 274 December 1998)
- Implementation issues
- Multiple Language attributes are stored in coded
name/value pairs, attribute name codes are then
represented in different languages in a separate
table. Attribute value codes are translated in a
similar table.
14Current Research on Database Support for
E-Commerce (based on Martin Birchler et al, ACM
Sigmod Record Special Section on E-Com 274
December 1998)
- New Roles of DB Management Systems traditional
database applications vs. component-based
software engineering - XML based document databases finding the best
solution for storing XML documents - Database support for large scale repositories of
business process components such as a data
warehouse issues of data warehouse for
documents.
15Example Application 3 XML Database Management
- XML (eXtensible Markup Language) is a document
description meta language that is used to
represent large-scale data and documents in the
World Wide Web - XML enables information exchange and
interoperability between applications - In the last few years researchers have
investigated the best database storage for XML
Document Management
16XML Background and Overview
- XML is designed to
- separate syntax from semantics to provide a
common framework for structuring information - allow tailor-made markup for any imaginable
application domain
17Current XML Technologies
XML Background and Overview
- XML Schema is used to define the structure of the
XML documents - XSL (eXstensible Stylesheet Language) and CSS
(Cascading Style Sheet) are used to display XML
documents - XSLT is used to transform the XML document into
XHTML (or HTML) or plain text, including
automatic construction of index, references, etc.
XSLT is a subset of XSL. It is mainly used to
perform data extraction, sorting and
restructuring of XML documents
18Current XML Technologies
XML Background and Overview
- XPath is a querying language that allows
retrieval of specified parts of an XML document
(e.g. all article titles with the abstract and
section titles). - XQuery will be used to express queries on XML
documents (SQL for XML).
19XML Document Example
XML Background and Overview
- lt?xml version1.0 encodingISO-8859-1?
standaloneyes - lt?xml-stylesheet typetext/css
hrefgreeting.css?gt - ltdocument_taggt
- ltsection_taggtsectioncontentlt/section_taggt
- ltsubsection_taggtsubsection content
- lt/subsection_taggt
- lt/document_taggt
- lt!--this is a template XML Document-gt
EPILOG Comment
20XML Elements Example
XML Background and Overview
- ltbookgt
- lttitlegtMy First XMLlt/titlegt
- ltprod id"33-657" media"paper"gtlt/prodgt
- ltchaptergtIntroduction to XML
- ltparagtWhat is HTMLlt/paragt
- ltparagtWhat is XMLlt/paragt lt/chaptergt
- ltchaptergtXML Syntax
- ltparagtElements must have a closing taglt/paragt
- ltparagtElements must be properly nestedlt/paragt
- lt/chaptergt
- lt/bookgt
Elements
21XML Schema Basics
XML Background and Overview
- Schema refers to the structure of data
- Builds a data model
- Inherently defines structure and relationships
between data - Readable by both people and machines
22Why Schemas Are Important?
XML Background and Overview
- Provide standard structures for data exchange
- Provide constraints for testing document validity
- Help manage integration of documents and data
- Framework for both document and data structures
- Help separate parts of documents for intelligent
search/manipulation
23XML Document Example personxml.xml
XML Background and Overview
- lt?xml version"1.0" encoding"UTF-8"?gt
- lt?xml-stylesheet type"text/xsl"
href"personxsl.xsl"?gt - ltPersonInfo xmlnsxs"http//www.w3.org/2001/XMLSc
hema" xsschemaLocation"personxmlschema.xsd"gt - ltPersongt
- ltnamegtLee Cottrelllt/namegt
- ltaddressgt123 Herelt/addressgt
- ltcitygtMelbournelt/citygt
- ltgendergtFlt/gendergt
- lt/Persongt
- ltPersongt
- ltnamegtDavid Schmidtlt/namegt
- ltaddressgt707 Therelt/addressgt
- ltcitygtSydneylt/citygt
- ltgendergtMlt/gendergt
- lt/Persongt
- lt/PersonInfogt
24XML Schema Example personxmlschema.xsd
XML Background and Overview
- ltxsdschema xmlnsxsd"http//www.w3.org/2001/XMLS
chema" version"1.0"gt - ltxsdelement name"PersonInfo"gt
- ltxsdcomplexTypegt
- ltxsdelement name"Person"gt
- ltxsdcomplexTypegt
- ltxsdsequencegt
- ltxsdelement name"name"
type"xsdstring" maxOccurs"1" minOccurs"1" /gt - ltxsdelement name"address"
type"xsdstring" maxOccurs"unbounded"
minOccurs"0" /gt - ltxsdelement name"city"
type"xsdstring" maxOccurs"unbounded"
minOccurs"0" /gt - ltxsdsimpleType name"gender"gt
25XML Schema Example (cont)
XML Background and Overview
- ltxsdrestriction
base"xsdstring"gt - ltxsdenumeration value"M" /gt
- ltxsdenumeration value"F" /gt
- lt/xsdrestrictiongt
- lt/xsdsimpleTypegt
- lt/xsdsequencegt
- lt/xsdcomplexTypegt
- lt/xsdelementgt
- lt/xsdcomplexTypegt
- lt/xsdelementgt
- lt/xsdschemagt
26XSL Example personxsl.xsl
XML Background and Overview
- ltxslstylesheet version"1.0" xmlnsxsl"http//ww
w.w3.org/TR/WD-xsl"gt - ltxsltemplate match"/"gt
- lthtmlgt
- ltheadgt
- lttitlegtThis is the person pagelt/titlegt
- lt/headgt
- ltbodygt
- lth1gtPerson Databaselt/h1gt
- ltpgtThis simply displays the data on the page with
some formattinglt/pgt - lttable border"1"gt
- lttrgt
- ltthgtNamelt/thgt
- ltth colspan"2"gtAddresseslt/thgt
27XSL Example (cont)
XML Background and Overview
- ltth colspan"2"gtCitylt/thgt
- ltthgtGenderlt/thgt
- lt/trgt
- ltxslfor-each select"PersonInfo/Person"gt
- lttrgt
- lttdgtltbgtltigtltxslvalue-of select"name"/gtlt/igtlt/bgtlt/
tdgt - lttdgt
- ltxslfor-each select"address"gt
- lttdgt
- ltxslvalue-of select"."/gt
- lt/tdgt
- lt/xslfor-eachgt
- lt/tdgt
28XSL Example (cont)
XML Background and Overview
- lttdgt
- ltxslfor-each select"city"gt
- lttdgt
- ltxslvalue-of select"."/gt
- lt/tdgt
- lt/xslfor-eachgt
- lt/tdgt
- lttdgtltbgtltigtltxslvalue-of select"gender"/gtlt/igtlt/bgt
lt/tdgt - lt/trgt
- lt/xslfor-eachgt
- lt/tablegt
- lt/bodygt
- lt/htmlgt
- lt/xsltemplategt
- lt/xslstylesheetgt
29The output (XML-XMLSchema-XSL)
XML Background and Overview
30Different Storage for XML
- There are several options for XML repository,
from file system to the Native XML database - There are also products that utilize their mature
DBMS technology with some adjustment to
facilitate XML. This product can be categorized
as XML-Enabled database, eg. Oracle9i
31XML Relational Data in Oracle
- Oracle Database is originally a pure RDBMS. Since
Oracle8i, the database has shifted to become an
ORDBMS. The generation of Oracle9i and above aims
to become an XML-enabled DBMS - To achieve full integration of XML and
(object-)relational data, two aspects of
functionality need to be addressed by Oracle - From a data storage perspective, both XML and
relational data should be supported in the same
database - From a query language perspective, both SQL-based
and XML-based query should be available
32Storing XML in Oracle9i
- Oracle strategy for XML storage will depend on
the nature of the XML document - For data-centric, the XML documents can be stored
in one or more tables - For document-centric, Oracle original solution is
by storing the document in unstructured LOBs
(Large Objects) - A newer approach is by introducing XMLType from
Oracle 9i onward
33Storing XML in Oracle9i - 2
34Processing XML Document Oracle Storage
- Regardless of how Oracle stores the XML document,
they have some basic components below
XML Documents
Oracle Text Cartridge
Oracle XML Parser
Oracle Database
Oracle XML SQL Utility
Oracle XPath Engine
XML Infoset
Oracle XSLT Processor
35Data-Centric in Relational Tables
- Oracle stores the data-centric XML elements into
Object-Relational (OR) tables and columns,
including as the collection, nested table, object
types, and other Oracle OR features - Once stored in the OR form, the data can be
easily updated, queried, rearranged and
reformatted as needed using SQL - The capability for viewing XML data as relational
data is useful to maximize the applicability of
relational technology and tools for data mining,
analysis, report generation, etc
36XML SQL Utility (XSU)
- The mapping process is enabled by Oracle XML SQL
Utility (XSU) - XSU
- accepts the application queries, passing them to
the database and returning the resulting data in
an XML format corresponding to the database
schema of the query - accepts an XML Document conformant to the
database schema and save the data untagged in the
database across this schema - creates the DTD/XML Schema that represents the
queried database schema - XSU installation requires the following
components - Database connectivity JDBC driver
- XML Parser Oracle XML Parser version2 (comes
with Oracle 8i and 9i) - XSU comes with a simple command line front end
which gives you quick access to XML generation
and insertion.
37Mapping XML Data and OR Features
- This option maps an XML Document to the
underlying OR storage and conversely provides the
capability to retrieve the OR data as an XML
document - The basis for both direction is a schematic
mapping from relational structure to XML. - The whole query results is enclosed by
ltROWSETgtlt/ROWSETgt - Each tuple of the result is put in ltROWgt lt/ROWgt
- Attribute values turn into tags ltNamegtlt/Namegt
- Object types becomes inner tags according to the
internal structure - Collection has it name as the tag to encloses the
collection value
38Mapping XML Data and OR Features -2
- Sample Object-Relational tables
- CREATE TYPE AddressType AS OBJECT
- ( STREET VARCHAR2(20),
- CITY VARCHAR2(20),
- STATE CHAR(2),
- ZIP VARCHAR2(10) ) /
- CREATE TYPE EmployeeType AS OBJECT
- ( EMPNO NUMBER,
- ENAME VARCHAR2(20),
- SALARY NUMBER,
- EMPADDR AddressType
- )
- /
- CREATE TYPE EmployeeListType AS TABLE OF
EmployeeType - /
39Mapping XML Data and OR Features - 3
- Sample Object-Relational tables (ctd)
- CREATE TABLE dept
- (DEPTNO NUMBER,
- DEPTNAME VARCHAR2(20),
- DEPTADDR AddressType,
- EMPLIST EmployeeListType )
- NESTED TABLE EMPLIST STORE AS EMPLIST_TABLE
- Assume that valid values are stored in table dept
above. - For the query
- select from dept
- XSU generates the following XML document
40Mapping XML Data and OR Features - 4
- lt?xml version'1.0'?gt
- ltROWSETgt
- ltROW num"1"gt
- ltDEPTNOgt100lt/DEPTNOgt
- ltDEPTNAMEgtSportslt/DEPTNAMEgt
- ltDEPTADDRgt
- ltSTREETgt100 Redwood Shores Pkwylt/STREETgt
- ltCITYgtRedwood Shoreslt/CITYgt
- ltSTATEgtCAlt/STATEgt
- ltZIPgt94065lt/ZIPgt
- lt/DEPTADDRgt
- ltEMPLISTgt
- ltEMPLIST_ITEM num"1"gt
- ltEMPNOgt7369lt/EMPNOgt
- ltENAMEgtJohnlt/ENAMEgt
- ltSALARYgt10000lt/SALARYgt
- ltEMPADDRgt
- ltSTREETgt300 Embarcaderolt/STREETgt
- ltCITYgtPalo Altolt/CITYgt
41XML Query Example
- ltPROTEIN xmlnssql"urnschemas-microsoft-comxml-
sql"gt - ltsqlheadergt
- Â ltsqlparam name"DatabaseEntryValue"gtpmap1lt/sq
lparamgt - Â lt/sqlheadergt
- Â ltsqlquerygtSELECT FROM DatabaseEntry WHERE
ENTRYID _at_DatabaseEntryValue FOR XML AUTO - SELECT FROM Entry WHERE ENTRYID
_at_DatabaseEntryValue FOR XML AUTO, ELEMENTS - SELECT FROM Compounds WHERE ENTRYID
_at_DatabaseEntryValue FOR XML AUTO, ELEMENTS - SELECT FROM Source WHERE ENTRYID
_at_DatabaseEntryValue FOR XML AUTO, ELEMENTS - lt/sqlquerygt
- lt/PROTEINgt
42Mapping XML Data and OR Features - 5
- Using the above technique, we can use regular SQL
expressions for insert, update, and delete in
order to modify the data in the database. All
these changes will be reflected in the XML
document when retrieved using the XSU component. - For more examples, you can refer to the online
oracle documentation.