Interoperability and XML - PowerPoint PPT Presentation

Slides: 26
Provided by: davidala3
Transcript and Presenter's Notes

Title: Interoperability and XML


1
Interoperability and XML
  • David Nelson/Sue Patience
  • CAT
  • May 2005

2
Objectives
  • To investigate issues surrounding
    interoperability
  • To gain a basic understanding of XML and its
    developments related to database systems
  • To gain a basic understanding of the use of XML
    towards achieving interoperability

3
Interoperability
  • IEEE (1990) definition:
  • "the ability of two or more systems or components
    to exchange information and to use the
    information that has been exchanged"
  • Source: IEEE Standard Computer Dictionary: A
    Compilation of IEEE Standard Computer Glossaries
  • Current simple solutions
  • mediation
  • transformation

4
Features
  • Exchange of messages and requests
  • Use of each other's functionality
  • Client-server abilities
  • Distribution
  • Operate multiple systems as single unit
  • Communication despite incompatibilities
  • Extensibility and evolution

5
The Problems and Difficulties
  • Different data models
  • There can be major semantic differences even
    within the same data model
  • Properties may be called by different names
  • Different data types may be used
  • What about recreating locally defined functions?

6
The Problems and Difficulties
  • All this implies we know where they are and we
    have a physical means of getting to them
  • Databases are by their nature protectors of
    data; they do not share easily
  • Many (particularly legacy systems) do not have
    any form of web interface
  • Most databases are security protected
  • Databases do not advertise their services to the
    web

7
Some Simple Problems 1
  • Differing schema (first database vs. second)
  • author char(50)  vs.  author_surname char(50),
    author_inits char(10)
  • title varchar(300)  vs.  title varchar(200)
  • keyword set(char(30))  vs.  keywd array(8)
    (char(30))
  • both are valid schemas in SQL:2003
  • also differing name formats: A.N.Other, A N Other,
    Other N A, ...
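
The mismatches on this slide are what a mediation or transformation layer has to bridge. The following is a minimal sketch only, assuming records arrive as plain Python dictionaries; the function names `split_author` and `to_target` are hypothetical, and the rule that the surname is the last token is an assumption that does not cover the "Other N A" form.

```python
import re

def split_author(author):
    """Split 'A.N.Other' or 'A N Other' into (surname, initials).

    Assumes initials come first and the surname is the last token.
    """
    tokens = [t for t in re.split(r"[.\s]+", author.strip()) if t]
    return tokens[-1], "".join(tokens[:-1])

def to_target(record):
    """Map a record from the first schema onto the second schema."""
    surname, initials = split_author(record["author"])
    return {
        "author_surname": surname[:50],
        "author_inits": initials[:10],
        # The target title column is shorter, so long titles are truncated.
        "title": record["title"][:200],
        # A set has no order or size limit; the target array holds at most 8.
        "keywd": sorted(record["keyword"])[:8],
    }
```

Even this small mapping loses information (truncated titles, dropped keywords), which is why schema differences are listed as a problem rather than a formality.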

8
Some Simple Problems 2
  • Homogeneous Models
  • the same information may be held as attribute
    name, relation name or a value in different
    databases
  • e.g. library fines
  • as a dedicated relation Fine(amount,
    borrowed_id)
  • as an attribute Loan(id, isbn, date_out, fine)
  • or as a value Charge(1.25, fine)
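
As a rough illustration of why the three designs need three different access paths, the sketch below extracts fine amounts from each representation. Rows are assumed to be plain tuples in the column order shown on the slide; the column name `kind` for the second column of Charge and all function names are hypothetical.

```python
def fines_from_relation(fine_rows):
    """Fine(amount, borrowed_id): every row in the relation is a fine."""
    return [amount for amount, _borrowed_id in fine_rows]

def fines_from_attribute(loan_rows):
    """Loan(id, isbn, date_out, fine): the fine is one column of a loan."""
    return [fine for _id, _isbn, _date_out, fine in loan_rows if fine]

def fines_from_value(charge_rows):
    """Charge(amount, kind): only rows whose kind is 'fine' are fines."""
    return [amount for amount, kind in charge_rows if kind == "fine"]
```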

9
Complex Problems
  • Heterogeneous models
  • Need to relate model constructions to one
    another, for example
  • relate classes in object-oriented to user-defined
    types in object-relational
  • Or even more problematic, to tables in a
    relational database
  • All problems are magnified at this level!

10
Extensible Markup Language (XML)
  • A simplified version of SGML, designed
    specifically for Web documents
  • a meta-language to create customised tags which
    provide functionality not available in HTML
  • links can point to multiple documents
  • links can be bi-directional
  • links to relative objects
  • broken into
  • stylesheet (XSL standard)
  • document type definition (DTD) against which
    documents are validated
  • document data

11
Sample XML Database
  • <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  • <?xml-stylesheet type="text/xsl" href="staff_list.xsl"?>
  • <!DOCTYPE STAFFLIST SYSTEM "staff_list.dtd">
  • <STAFFLIST>
  • <STAFF branchNo="B005">
  • <STAFFNO>SL21</STAFFNO>
  • <NAME>
  • <FNAME>John</FNAME><LNAME>White</LNAME>
  • </NAME>
  • <POSITION>Manager</POSITION>
  • </STAFF>
  • <STAFF branchNo="B003"> ... </STAFF>
  • </STAFFLIST>
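
To show how an application might consume a document like this one, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The prolog and the second, incomplete STAFF record are left out of the embedded text, and the function name `read_staff` and the dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET

STAFF_XML = """<STAFFLIST>
  <STAFF branchNo="B005">
    <STAFFNO>SL21</STAFFNO>
    <NAME><FNAME>John</FNAME><LNAME>White</LNAME></NAME>
    <POSITION>Manager</POSITION>
  </STAFF>
</STAFFLIST>"""

def read_staff(xml_text):
    """Return one dictionary per STAFF element."""
    root = ET.fromstring(xml_text)
    rows = []
    for staff in root.findall("STAFF"):
        rows.append({
            "branchNo": staff.get("branchNo"),        # attribute
            "staffNo": staff.findtext("STAFFNO"),     # child element text
            "fname": staff.findtext("NAME/FNAME"),    # nested element text
            "lname": staff.findtext("NAME/LNAME"),
            "position": staff.findtext("POSITION"),
        })
    return rows
```

The element and attribute names carry the meaning, so a receiving system needs no positional knowledge of the record layout.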

12
Sample DTD
  • <!ELEMENT STAFFLIST (STAFF)*>
  • <!ELEMENT STAFF (NAME, POSITION, DOB?, SALARY)>
  • <!ELEMENT NAME (FNAME, LNAME)>
  • <!ELEMENT FNAME (#PCDATA)>
  • <!ELEMENT LNAME (#PCDATA)>
  • <!ELEMENT POSITION (#PCDATA)>
  • <!ATTLIST STAFF branchNo CDATA #IMPLIED>
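
A DTD content model is essentially a regular expression over child element names. The sketch below hand-codes the STAFF model (NAME, POSITION, optional DOB, SALARY) as a Python regex. It is an illustration of the idea, not a DTD validator; the names `STAFF_MODEL` and `staff_matches_model` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content model (NAME, POSITION, DOB?, SALARY) written as a regex over
# the comma-terminated sequence of child tag names.
STAFF_MODEL = re.compile(r"NAME,POSITION,(DOB,)?SALARY,")

def staff_matches_model(staff):
    """True if the element's children appear in the order the model allows."""
    children = "".join(child.tag + "," for child in staff)
    return STAFF_MODEL.fullmatch(children) is not None
```

Note that a record containing STAFFNO and no SALARY, like the one on the sample XML slide, would not satisfy this content model as written.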

13
Sample StyleSheet
  • <?xml version="1.0"?>
  • <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  • <xsl:template match="/">
  • <html><body>
  • <center><h2>DreamHome Estate agents</h2></center>
  • <table border="1" bgcolor="#ffffff">
  • <tr>
  • <th>staffNo</th>
  • --- repeat for other column headings
  • </tr>
  • <xsl:for-each select="STAFFLIST/STAFF">
  • <tr><td><xsl:value-of select="STAFFNO"/></td>
  • <td><xsl:value-of select="NAME/FNAME"/></td></tr>
  • </xsl:for-each></table></body></html>
  • </xsl:template>
  • </xsl:stylesheet>
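
For readers who have not met XSL before, the sketch below does by hand in Python roughly what the stylesheet describes: loop over STAFFLIST/STAFF and emit one table row per member of staff. It is not an XSLT processor; the function name `staff_table` and the second column heading `fName` are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

def staff_table(xml_text):
    """Render staff number and first name as an HTML table."""
    root = ET.fromstring(xml_text)                 # root element is STAFFLIST
    rows = ["<tr><th>staffNo</th><th>fName</th></tr>"]
    for staff in root.findall("STAFF"):            # for-each STAFFLIST/STAFF
        staff_no = escape(staff.findtext("STAFFNO", default=""))
        fname = escape(staff.findtext("NAME/FNAME", default=""))
        rows.append(f"<tr><td>{staff_no}</td><td>{fname}</td></tr>")
    return '<table border="1">' + "".join(rows) + "</table>"
```

The data document is untouched; only the presentation rules change, which is the separation of content and presentation that XSL is meant to provide.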

14
Benefits of XML
  • Simplicity
  • Open standard and platform/vendor-independent
  • Extensibility
  • Reuse
  • Separation of content and presentation
  • Improved load balancing
  • Due to client side processing

15
Benefits of XML
  • Support for integration of data from multiple
    sources
  • Ability to describe data from a wide variety of
    applications
  • More advanced search engines
  • XQuery
  • New opportunities

16
XML Schema
  • <xsd:group name="STAFFTYPE">
  • <xsd:element name="STAFF">
  • <xsd:complexType>
  • <xsd:sequence>
  • <xsd:element name="STAFFNO" type="STAFFNOTYPE"/>
  • <xsd:element name="NAME">
  • <xsd:complexType>
  • <xsd:sequence>
  • <xsd:element name="FNAME" type="xsd:string"/>
  • <xsd:element name="LNAME" type="xsd:string"/>
  • </xsd:sequence>
  • </xsd:complexType>
  • ...

17
XQuery
  • A query language for XML
  • e.g. List the staff at branch B005 with a salary
    greater than 15000
  • for $S in doc("staff_list.xml")//STAFF
  • where $S/SALARY > 15000 and
  • $S/@branchNo = "B005"
  • return $S/STAFFNO
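
The same selection can be expressed procedurally, which may help when reading the query. This is a minimal sketch using Python's standard `xml.etree.ElementTree`, not an XQuery engine; the function name `well_paid_staff` is hypothetical, and any SALARY values fed to it are made-up test data since the sample document shows none.

```python
import xml.etree.ElementTree as ET

def well_paid_staff(xml_text, branch="B005", min_salary=15000):
    """Return the STAFFNO of staff at `branch` earning more than `min_salary`."""
    root = ET.fromstring(xml_text)
    result = []
    for staff in root.iter("STAFF"):                  # //STAFF
        salary = staff.findtext("SALARY")
        if (salary is not None
                and float(salary) > min_salary        # SALARY > 15000
                and staff.get("branchNo") == branch): # @branchNo = "B005"
            result.append(staff.findtext("STAFFNO"))
    return result
```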

18
Storing XML in a Relational Database
  • Three approaches
  • Fine Grained
  • Coarse Grained
  • Medium Grained

19
Fine Grained Approach
  • Good for queries which need to inspect/manipulate
    specific elements in the XML document
  • Not good for queries which manipulate (e.g.
    retrieve/store) the entire document

[Figure: fine-grained storage tables: Document, Element (parent), Child, CharData, Attribute]
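
A minimal sketch of fine-grained storage, assuming SQLite as the relational database. It simplifies the table set in the figure: the parent link is a column on `element` rather than a separate Child table, and there is no Document table. Table, column, and function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

SCHEMA = """
CREATE TABLE element   (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT);
CREATE TABLE attribute (element INTEGER, name TEXT, value TEXT);
CREATE TABLE chardata  (element INTEGER, text TEXT);
"""

def shred(conn, node, parent=None):
    """Store one element, its attributes and its text, then recurse."""
    cur = conn.execute("INSERT INTO element (parent, tag) VALUES (?, ?)",
                       (parent, node.tag))
    node_id = cur.lastrowid
    for name, value in node.attrib.items():
        conn.execute("INSERT INTO attribute VALUES (?, ?, ?)",
                     (node_id, name, value))
    if node.text and node.text.strip():
        conn.execute("INSERT INTO chardata VALUES (?, ?)",
                     (node_id, node.text.strip()))
    for child in node:
        shred(conn, child, node_id)
    return node_id

def children_of(conn, tag):
    """Element-level query: child tags of every element named `tag`."""
    rows = conn.execute(
        "SELECT c.tag FROM element c JOIN element p ON c.parent = p.id "
        "WHERE p.tag = ? ORDER BY c.id", (tag,))
    return [r[0] for r in rows]
```

Element-level questions become a single join, but rebuilding the whole document would mean walking every row, which matches the trade-off listed on the slide.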
20
Coarse Grained Approach
  • One table
  • Best for queries which manipulate whole document
  • e.g. retrieve/store a document
  • Worst for queries which manipulate elements
  • e.g. retrieve children of a tag
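
A minimal sketch of the one-table approach, again assuming SQLite; the table name `document` and the function names are hypothetical. Whole-document storage and retrieval are single statements, while any element-level question has to parse the full text first.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_document(conn, name, xml_text):
    """Store the whole document as one text value."""
    conn.execute("CREATE TABLE IF NOT EXISTS document "
                 "(name TEXT PRIMARY KEY, body TEXT)")
    conn.execute("INSERT OR REPLACE INTO document VALUES (?, ?)",
                 (name, xml_text))

def load_document(conn, name):
    """Retrieve the whole document in a single lookup."""
    row = conn.execute("SELECT body FROM document WHERE name = ?",
                       (name,)).fetchone()
    return row[0] if row else None

def children_of(conn, name, tag):
    """Element-level access: the full text must be parsed on every call."""
    root = ET.fromstring(load_document(conn, name))
    parent = root if root.tag == tag else root.find(f".//{tag}")
    return [child.tag for child in parent] if parent is not None else []
```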

21
Medium Grained Approach
  • A compromise between fine and coarse grained
  • Slice document tree up into sections
  • Store sub-sections using a coarse grained
    approach
  • Good for both types of queries
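
One possible way to slice a document, sketched under the assumption that each child of the root element is a section stored as text in SQLite. The slicing rule, the table name `section`, and the function names are all hypothetical, and attributes on the root element are not preserved in this sketch.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_sections(conn, doc_name, xml_text):
    """Store each child of the root as its own text fragment."""
    conn.execute("CREATE TABLE IF NOT EXISTS section "
                 "(doc TEXT, seq INTEGER, body TEXT)")
    root = ET.fromstring(xml_text)
    for seq, child in enumerate(root):
        child.tail = None  # drop whitespace between sections
        conn.execute("INSERT INTO section VALUES (?, ?, ?)",
                     (doc_name, seq, ET.tostring(child, encoding="unicode")))
    return root.tag

def load_section(conn, doc_name, seq):
    """Element-level access: parse only the section that is needed."""
    row = conn.execute("SELECT body FROM section WHERE doc = ? AND seq = ?",
                       (doc_name, seq)).fetchone()
    return ET.fromstring(row[0]) if row else None

def load_document(conn, doc_name, root_tag):
    """Whole-document access: concatenate the sections in order."""
    rows = conn.execute("SELECT body FROM section WHERE doc = ? ORDER BY seq",
                        (doc_name,))
    body = "".join(r[0] for r in rows)
    return ET.fromstring(f"<{root_tag}>{body}</{root_tag}>")
```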

22
XML and RDF
  • Resource Description Framework
  • XML Schema defines a grammar
  • therefore we have all the problems shown
    previously (e.g. names)
  • RDF provides a way to encode domain models
  • "an infrastructure that enables the encoding,
    exchange and reuse of structured meta-data" (W3C)
  • Defines semantics, syntax and structure
  • this is what we need for interoperable systems

23
RDF Data Model
  • RDF Data Model consists of three objects
  • Resource
  • anything that can have a URI
  • Property
  • a specific attribute which is used to describe a
    resource
  • Statement
  • a combination of a resource, a property and a
    value
  • e.g. The Author of http://www.dreamhome.co.uk/
    staff_list.xml is John White
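
The statement in the example can be held as a simple three-part record. This is only a sketch of the data model in Python, not RDF/XML syntax or an RDF library; the names `Statement`, `prop`, and `describe` are hypothetical.

```python
from collections import namedtuple

# A statement combines a resource, a property and a value.
Statement = namedtuple("Statement", ["resource", "prop", "value"])

stmt = Statement(
    resource="http://www.dreamhome.co.uk/staff_list.xml",
    prop="Author",
    value="John White",
)

def describe(s):
    """Render a statement in the sentence form used on the slide."""
    return f"The {s.prop} of {s.resource} is {s.value}"
```

Because every statement has the same three-part shape, statements from different sources can be merged into one collection without first agreeing on a record layout.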

24
Summary
  • XML is being increasingly used in data models,
    data transmission and data integration
  • Interoperability is the key issue and a major
    research area in database systems
  • XML and RDF have potential as a stepping
    stone to achieving this

25
Further Reading
  • Connolly and Begg
  • chapter 29 (sections 29.1, 29.2 and 29.3)
    discusses XML and its related technologies
  • Graves
  • Designing XML Databases, Prentice Hall.
  • XML Tutorial
  • www.w3schools.com/xml
  • RDF introduction (Idiot's Guide!)
  • http://archive.dstc.edu.au/RDU/reports/RDF-Idiot/