An XML Object Database: Design Implementation and Applications - PowerPoint PPT Presentation

1
An XML Object Database: Design, Implementation,
and Applications
  • Ching-Long Yeh
  • Department of Computer Science and Engineering
  • Tatung University
  • Taipei 104, Taiwan
  • ROC

2
Introduction
  • XML improves upon HTML in
  • capturing the meaning of a document and
  • extending the tag set.
  • At the same time, it also reduces the complexity
    of SGML.
  • It is believed that XML will soon be the standard
    of data exchanges on the Web.

3
Introduction
  • Due to lack of indices in files, we are not able
    to make full use of the meaning (or metadata) in
    an XML document, if it is stored in a file.
  • Since an XML document can be easily viewed
    according to the object-oriented model, a
    promising solution is to employ object database
    technology to manage the access of XML documents.

4
Introduction
  • In this talk, I will present our research work on
  • the design and implementation of an XML object
    DB,
  • an extensible template-based query interface for
    accessing the XML object database, and
  • applications implemented on the XML object
    database:
  • a content-based video query system, and
  • electronic commerce.

5
The Remainder of the Talk
  • An Introduction to XML
  • Design and Implementation of an XML Object
    Database
  • An Extensible Template-based Interface
  • A Content-Based Query Interface to Video Database
  • XML Object Database and Electronic Commerce

6
An Introduction to XML
7
HyperText Markup Language
  • HTML is a language used to create hypertext
    on the WWW.
  • The text is presented according to a set of
    predefined tags.
  • The definition of tags is based on the Document
    Type Definition (DTD) of SGML.
  • In other words, HTML is an application of SGML in
    the WWW.

8
Standard Generalized Markup Language
  • Central to SGML is the concept that documents
    have structure, content, and format.
  • These three ingredients combine to form a
    document.

9
Content
  • What is Content?
  • Content is the actual data within a document.
  • The words and illustrations that make up a
    bicycle assembly manual are its contents.

10
Format
  • What is Format?
  • Format consists of how the words, sentences, and
    paragraphs are visually presented and
    distinguished from one another within a document.
  • Boldface for title, italics for special terms,
    and blank lines between sections are examples of
    document formats.
  • People often confuse format with structure.

11
Structure
  • What is Structure?

Recipe
  Title (Coconut Pudding)
  Ingredient List
    Ingredient
  Instruction List
    Step
12
Document Type Definition
  • Defining structures in SGML
  • The structure of a document (its type) is
    defined by a document type definition, or DTD.
  • The DTD lays out the rules for a document through
    the use of elements, attributes, and entities.

13
Document Type Definition
  • A DTD looks like

<!ELEMENT recipe - - (title, ingredientList, instructionList)>
<!ELEMENT title - - (#PCDATA)>
<!ELEMENT ingredientList - - (ingredient+)>
<!ELEMENT instructionList - - (step+)>
<!ELEMENT ingredient - - (#PCDATA)>
<!ELEMENT step - - (#PCDATA)>
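As a rough illustration of how element declarations like these can be read by a program, here is a minimal Python sketch. It is not the DTD parser used in this work: the function name `parse_element_decls` and the regular expression are hypothetical, the `+` occurrence indicators in the sample text are an assumption, and attribute and entity declarations are ignored.

```python
import re

# Hypothetical pattern: matches "<!ELEMENT name [- -] (content)>" declarations.
# The optional "- -" covers SGML tag-omission markers; attributes and
# entities are not handled in this sketch.
ELEMENT_DECL = re.compile(
    r"<!ELEMENT\s+(\w+)\s+(?:-\s+-\s+)?(\([^>]*\)|EMPTY|ANY)\s*>"
)


def parse_element_decls(dtd_text):
    """Return a dict mapping each declared element name to its content model."""
    models = {}
    for name, content in ELEMENT_DECL.findall(dtd_text):
        models[name] = content.strip()
    return models


# Cleaned-up recipe DTD; the "+" indicators are assumed, not confirmed.
RECIPE_DTD = """
<!ELEMENT recipe - - (title, ingredientList, instructionList)>
<!ELEMENT title - - (#PCDATA)>
<!ELEMENT ingredientList - - (ingredient+)>
<!ELEMENT instructionList - - (step+)>
<!ELEMENT ingredient - - (#PCDATA)>
<!ELEMENT step - - (#PCDATA)>
"""

models = parse_element_decls(RECIPE_DTD)
for element, content_model in models.items():
    print(element, "->", content_model)
```

Each (element, content model) pair is the kind of fact a schema generator could later turn into a class definition.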
14
Document Instance
<!DOCTYPE RECIPE PUBLIC "recipe" "recipe">
<RECIPE>
<TITLE>Coconut Pudding</TITLE>
<INGREDIENTLIST>
<INGREDIENT>12 ounces coconut milk</INGREDIENT>
<INGREDIENT>4 to 6 tablespoons sugar</INGREDIENT>
<INGREDIENT>4 to 6 tablespoons cornstarch</INGREDIENT>
<INGREDIENT>3/4 cup water</INGREDIENT>
</INGREDIENTLIST>
<INSTRUCTIONLIST>
<STEP>Pour coconut milk into saucepan.</STEP>
<STEP>Combine sugar and cornstarch; stir in water and blend well.</STEP>
<STEP>Stir sugar mixture into coconut milk; cook and stir over low heat until thickened.</STEP>
</INSTRUCTIONLIST>
</RECIPE>
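To make the idea of viewing a document instance as a tree of objects concrete, the following Python sketch parses a cleaned-up copy of the recipe with the standard `xml.etree.ElementTree` module. The `to_object` helper and the dictionary layout are assumptions made for illustration; they are not the object model of the database described in this talk.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the recipe instance (DOCTYPE line omitted).
RECIPE_XML = """
<RECIPE>
  <TITLE>Coconut Pudding</TITLE>
  <INGREDIENTLIST>
    <INGREDIENT>12 ounces coconut milk</INGREDIENT>
    <INGREDIENT>4 to 6 tablespoons sugar</INGREDIENT>
    <INGREDIENT>4 to 6 tablespoons cornstarch</INGREDIENT>
    <INGREDIENT>3/4 cup water</INGREDIENT>
  </INGREDIENTLIST>
  <INSTRUCTIONLIST>
    <STEP>Pour coconut milk into saucepan.</STEP>
    <STEP>Combine sugar and cornstarch; stir in water and blend well.</STEP>
    <STEP>Stir sugar mixture into coconut milk; cook and stir over low heat until thickened.</STEP>
  </INSTRUCTIONLIST>
</RECIPE>
"""


def to_object(element):
    """Hypothetical mapping: one dict per element, with nested children."""
    return {
        "type": element.tag,
        "text": (element.text or "").strip(),
        "children": [to_object(child) for child in element],
    }


root = ET.fromstring(RECIPE_XML.strip())
recipe = to_object(root)

# A path expression selects objects by their position in the hierarchy.
ingredients = [e.text for e in root.findall("INGREDIENTLIST/INGREDIENT")]
print(recipe["children"][0]["text"])
print(ingredients)
```

The path `INGREDIENTLIST/INGREDIENT` plays the same role as a path through the database schema: it names a class of objects by where they sit in the document structure.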
15
HTML, SGML, XML
  • HTML helped establish the Internet by providing a
    universal way to present information.
  • However, HTML only addresses the presentation of
    data.
  • Using SGML, users can add structure along with the
    content of a document.
  • However, SGML has proven too heavy-weight for the
    Internet.

16
Extensible Markup Language
  • XML is a simple dialect of SGML.
  • HTML is sufficient for sending web pages that are
    viewed by human beings.
  • XML, however, adds the tags that enable computers
    to understand, act on or process the information.
  • XML has been designed for ease of implementation
    and for interoperability with both SGML and HTML.

17
XML Application Profile
  • Electronic commerce
  • Electronic data interchange (EDI)
  • Fine-grain content publishing
  • Internet search engines
  • Distributed application design
  • etc.

18
Data Type Requirements of Documents
  • HTML
  • One file per page
  • Simple uni-directional linking
  • XML
  • Tens, hundreds or even thousands of objects per
    page
  • Multiple DTDs
  • Hierarchical structure and rich linking
  • Query and navigation capabilities required
  • Agents and business rules interact with the data

19
Data Types of Storage: File System
  • File system
  • Stores monolithic content.
  • Folder system on top of the files
  • Good at storing multimedia data

20
Data Types of Storage: Relational DB
  • Relational database
  • Tabular in nature
  • Good at storing rows and columns of data like
    spreadsheets and data from forms like invoices.

21
Data Types of Storage: Object-Oriented DB
  • Object-oriented database
  • Good at managing structured, hierarchical, richly
    linked information.
  • That's exactly what XML is.
  • XML is the object representation of data.

22
Design and Implementation of an XML Object
Database
23
System Architecture
24
DTD Parser
25
Parsing Result
26
Schema Generation
27
DI Parser
28
DI Parser Generation
for each contentModel(ElementName, ContentStructure) do
  generate the rule head for ElementName
  generate the start tag for ElementName
  generate the rule body for ContentStructure
  generate the end tag for ElementName
  generate the semantic action
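The generation loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: the output imitates Prolog grammar-rule syntax, the predicates `start_tag`, `end_tag`, `pcdata`, and `make_object` are hypothetical names, and occurrence indicators such as `+` or `?` are not handled.

```python
def generate_rule(element_name, content_structure):
    """Build one grammar rule (as text) for an element's content model.

    content_structure is a list of child element names, or ["#PCDATA"].
    """
    child_vars = ["C%d" % i for i in range(1, len(content_structure) + 1)]

    # Rule head for ElementName.
    head = element_name + "(Obj)"

    # Start tag, rule body for ContentStructure, end tag.
    goals = ["start_tag(" + element_name + ")"]
    for child, var in zip(content_structure, child_vars):
        rule_name = "pcdata" if child == "#PCDATA" else child
        goals.append(rule_name + "(" + var + ")")
    goals.append("end_tag(" + element_name + ")")

    # Semantic action: assemble the parsed children into one object.
    goals.append(
        "{ make_object(" + element_name + ", [" + ", ".join(child_vars) + "], Obj) }"
    )
    return head + " --> " + ", ".join(goals) + "."


content_models = {
    "recipe": ["title", "ingredientList", "instructionList"],
    "title": ["#PCDATA"],
}

rules = [generate_rule(name, body) for name, body in content_models.items()]
for rule in rules:
    print(rule)
```

Generating the parser from the content models means a new DTD needs no hand-written parsing code, only a fresh run of the generator.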
29
Implementation
  • We have built a prototype of the system using LPA
    Win-Prolog V3.5 on a personal computer.
  • It consists of a DTD parser, a schema generator, and
    a DI parser generator.
  • After creating the physical store and class
    family for XML documents, we can proceed to build
    the database schema for a DTD by executing the ODQL
    code generated by the DTD schema generator.

30
(No Transcript)
31
(No Transcript)
32
An Extensible Query-By-Template Interface for
Accessing an XML Document Database
33
Motivation
  • Current WWW search engines return vast numbers of
    search results.
  • A text-based query language with a simple
    English-like syntax is inconvenient for users.
  • Current user interfaces primarily use form-based
    queries.

34
Goal
  • The goal is to design a convenient interface that
    lets users access XML documents without prior
    knowledge of the document types.
  • The interface will relieve users of typing
    complex queries.
  • The interface should be web-based and
    platform-independent.

35
System Architecture
Visual Query Interface
36
Related Knowledge
  • XML (eXtensible Markup Language)
  • Jasmine Database
  • Structured Document Database
  • Visual Query Facility
  • Java Language

37
XML
  • XML has the potential to become the future
    standard for WWW documents and electronic data
    interchange.
  • In XML, the structure of the document is defined
    using a Document Type Definition (DTD).

38
Recipe DTD
39
Recipe Document
40
Jasmine Database
  • Jasmine is a multimedia object-oriented database
    management system (DBMS) with built-in
    Web connectivity.
  • It provides a powerful Object Data Query
    Language (ODQL), which is very similar to the ODMG
    2.0 (ODL, OML, OQL) standard.
  • It provides an extensive array of application
    development tools, including JADE, ActiveX,
    C-API, J-API, and WebLink.

41
Structured Document Database
  • Combine structured document with OODB technology
  • VERSO project at INRIA
  • News-On-Demand Application
  • Document Database from GMD-IPSI
  • Other Related Document Database
  • Open Text
  • DocBase
  • The Poet XML Repository
  • ODI's eXcelon

42
Visual Query Facility
  • Query By Example (QBE)
  • The interface is composed of tabular skeletons
    representing tables in the database.
  • Query By Forms (QBF)
  • The interface is presented with a list of
    searchable fields, each with an entry area that
    can be used to indicate the search string.
  • Query By Template (QBT)
  • The interface displays a template for a
    representative entry of the database. Users
    express their queries by entering the search
    keywords in the appropriate regions of the
    template.

43
Document Database Schema
44
Building an XML Document Database
45
Example of Image-based QBT
46
Limits of Image-based QBT
  • The image template is divided into regions, each
    of which corresponds to an element in the
    document structure.
  • Associated with each region is a query action.
  • Its significant drawback is the lack of
    flexibility in template creation.
  • It is difficult to automate the reconfiguration
    of the query actions associated with a new
    template.
  • A single interface template for all types of
    document is probably not a good idea.

47
Concept of eXtensible QBT (XQBT)
  • The environment provides a template creator which
    consists of a DTD schema browser and a scene for
    presentation design.
  • The environment aims at providing automatic
    configuration of query actions associated with
    presentation of template.
  • The design of the template presentation must be
    tightly coupled with the arrangement of document
    data stored in the repository.
  • The component in the design of presentation must
    be properly associated with corresponding nodes
    in the object database schema.

48
Environment for XQBT
49
Template Creator
  • The template creator consists of a DTD schema
    browser, a scene for the template draft, and a
    functional area.
  • The template creator relies mainly on the DTD
    schema browser, which corresponds to the database
    schema.
  • The scene is a visual display area where the
    designer can organize a template draft for a
    given purpose.
  • The content of the template draft is exported to a
    file, which contains the template presentation
    and additional information.

50
Template Creator
Functional Area
51
Exported File
  • The file contains information about the
    template presentation properties associated with
    each element.
  • Each element is annotated with its path
    in the database schema, so that the template
    executor can use this information to carry out
    query actions.

52
Template Executor
  • The template executor loads the exported file and
    presents the template as was originally designed
    in the template creator.
  • The path of each node in the DTD schema browser
    is used to carry out the query action required by
    the user.
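The relationship between the exported file and the template executor can be sketched as below. Everything here is an assumption made for illustration: the JSON layout, the region fields (`label`, `x`, `y`, `path`), and the function names are hypothetical, and the output is a plain list of (schema path, keyword) pairs rather than ODQL.

```python
import json

# Hypothetical template draft: each region carries presentation
# properties plus the path of its element in the database schema.
template = {
    "name": "recipe",
    "regions": [
        {"label": "Title", "x": 10, "y": 10, "path": "recipe/title"},
        {"label": "Ingredient", "x": 10, "y": 40,
         "path": "recipe/ingredientList/ingredient"},
        {"label": "Step", "x": 10, "y": 70,
         "path": "recipe/instructionList/step"},
    ],
}


def export_template(template):
    """Template creator side: serialize the draft to text."""
    return json.dumps(template, indent=2)


def load_template(text):
    """Template executor side: restore the draft as designed."""
    return json.loads(text)


def build_conditions(template, entries):
    """Pair each non-empty user entry with the schema path of its region."""
    paths = {region["label"]: region["path"] for region in template["regions"]}
    return [(paths[label], keyword)
            for label, keyword in entries.items() if keyword]


loaded = load_template(export_template(template))
conditions = build_conditions(loaded, {"Title": "Pudding", "Ingredient": ""})
print(conditions)
```

Because the path travels with each region, the executor needs no hand-coded query action per template; it only has to translate (path, keyword) pairs into the database's query language.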

53
Comparison between Image-based QBT and XQBT
Image-based QBT
  • The template is an image obtained by taking a
    photograph or by scanning existing pages.
  • The query action associated with each region is
    hand-coded.
  • Whether planar or nested, the template is limited
    to a region level that is not very deep.
XQBT
  • The template is generated for a representative
    document.
  • The associated query actions can be generated
    automatically for the interface program.
  • The designer can change the template to meet the
    requirements of various region levels.

54
Implementation
  • Java Proxies (Jp) for Jasmine
  • Jp allows developers to build their applications
    with J-API and take advantage of the Jasmine class
    libraries.

55
The interface for our XML document database
ingredient
Ingredient name
Ingredient step
56
Query Formulation
  • Such searches are performed by simply entering
    the search string in the corresponding region of
    the template.

57
Query Formulation (cont.)
58
Query Formulation (cont.)
  • Multiple conditions are specified in different
    regions and combined using logical connectives
    (such as AND, OR, NOT).
  • The logical expression is derived from its
    graphical representation using the default
    precedence.
  • Users can insert parentheses as necessary in the
    condition box, as used in the QBE interface.
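A small Python sketch of how conditions from several regions might be combined is given below. The slides do not state what the default precedence is; this example assumes the common convention that NOT binds tighter than AND, which binds tighter than OR, with parentheses overriding both. The function name and token format are hypothetical.

```python
def evaluate(tokens, values):
    """Evaluate a tokenized condition such as ["a", "OR", "b", "AND", "c"].

    values maps each region name to True/False (did its keyword match?).
    Assumed precedence: NOT > AND > OR; parentheses override it.
    """
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            right = parse_and()  # always parse, then combine
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_not()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            right = parse_not()
            result = result and right
        return result

    def parse_not():
        nonlocal pos
        if tokens[pos] == "NOT":
            pos += 1
            return not parse_not()
        if tokens[pos] == "(":
            pos += 1
            result = parse_or()
            pos += 1  # skip the closing ")"
            return result
        value = values[tokens[pos]]
        pos += 1
        return value

    return parse_or()


matches = {"title": True, "ingredient": False, "step": False}

# Default precedence: title OR (ingredient AND step)
default_result = evaluate("title OR ingredient AND step".split(), matches)

# Parentheses change the grouping: (title OR ingredient) AND step
grouped_result = evaluate(
    ["(", "title", "OR", "ingredient", ")", "AND", "step"], matches
)
print(default_result, grouped_result)
```

The two results differ, which is why a condition box with parentheses is useful when the default grouping is not what the user intends.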

59
The results of the query formulation
60
Template Creator
61
Template Executor
62
Future Work
  • A first step towards enhancement is to improve the
    template's capabilities in order to support more
    complex query facilities.
  • An enhancement of the template creator would be to
    provide more sophisticated facilities for managing
    the template, such as layout, size, color,
    position, etc.
  • We will try to include other document types to test
    the applicability of XQBT.

63
A Video Content Query System Based on an OODB
64
Introduction
  • We store the content description of video in an
    OODB (Jasmine, from CA) so that users can query
    video segments according to the video content.
  • A VDBMS needs to address the following important
    issues
  • Video data modeling
  • Video data insertion
  • Video data indexing
  • Video data query and retrieval

65
Introduction (cont.)
  • Overview of System Architecture

66
Related Research
  • R. Jain and A. Hampapur, "Metadata in Video
    Databases."
  • Application of Video
  • Query Dimensionality
  • M. Carrer, L. Ligresti, G. Ahanger, and T.D.C.
    Little, "An Annotation Engine for Supporting
    Video Database Population."
  • Video Segmentation
  • A Newscast Video Data Model

67
Related Research
  • E. Hwang and V. S. Subrahmanian, "Querying Video
    Libraries."
  • A Formal Model of Video Data Structures
  • R. Hjelsvold and R. Midtstraum, "Modelling and
    Querying Video Data."
  • Structure of the Generic Video Data Model

68
Related Research
  • Dublin Core-based Video Description Scheme
  • Hunter proposes to extend part of Dublin Core
    elements, i.e., Type, Description, Format,
    Relation, and Coverage, to cope with video
    content metadata requirements.
  • Hunter breaks film and video documents into the
    following hierarchical segments

Sequence
  Scene
    Shot
      Frame
        Object/Actor/Person
69
Research Issues
  • Video Data Modeling
  • Characteristics of Video Data
  • Video Logical Structure
  • Content of Video Data

70
Research Issues
  • Video data insertion
  • Extract key information
  • Break the given video stream into a set of basic
    units.
  • Manually or semi-automatically annotate the video
    unit.
  • Index and store video data into the video database

71
Research Issues
  • Video Data Indexing
  • Annotation-Based Indexing
  • Feature-Based Indexing
  • Domain-Specific Indexing

72
Research Issues
  • Video Data Query
  • Query content
  • Query matching type
  • Query granularity
  • Query behavior
  • Query specification

73
Research Issues
  • Content Description Language
  • Every segment of a video has a set of
    associated objects and associated activities
    that can be described.
  • A content description language is used to describe
    the video content and video structure.
  • An example is to describe video content by
    applying qualified Dublin Core to a hierarchically
    segmented video structure.

74
System Architecture
75
(No Transcript)
76
System Architecture
  • Video Content Annotating Program
  • Output is in two formats: one is XML and the
    other is an object database language, the
    Object Data Query Language (ODQL) of Jasmine.
  • The video content annotating program of the
    system employs a bottom-up approach to guide the
    human annotator to describe the content of video
    segments.

77
System Architecture
Bottom-Up
78
System Architecture
  • Query Interface
  • The matching type used in the query interface is
    keyword match, which is a kind of exact match.
  • The user specifies the keyword he or she wishes
    to find in the value field of each attribute. The
    interface then searches the content description
    of each SHOT in the object database to find
    those that satisfy the query.
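The keyword match described above can be illustrated with a short Python sketch. The shot records, attribute values, and the function name `match_shots` are made up for this example; in the actual system the descriptions live in the object database and are queried through it.

```python
# Illustrative shot descriptions (invented sample data, not from the
# fashion show video); attribute names follow the CLOTHES element.
shots = [
    {"shotnumber": 0,
     "clothes": {"type": "Women", "color": "red", "season": "Spring"}},
    {"shotnumber": 1,
     "clothes": {"type": "Men", "color": "black", "season": "Winter"}},
    {"shotnumber": 2,
     "clothes": {"type": "Women", "color": "black", "season": "Winter"}},
]


def match_shots(shots, criteria):
    """Return the numbers of shots whose attributes equal every keyword.

    Attributes left empty in the query form are ignored; the rest must
    match exactly, which is the 'keyword match' described above.
    """
    wanted = {attr: kw for attr, kw in criteria.items() if kw}
    return [
        shot["shotnumber"]
        for shot in shots
        if all(shot["clothes"].get(attr) == kw for attr, kw in wanted.items())
    ]


winter_black = match_shots(shots, {"color": "black", "season": "Winter"})
print(winter_black)
```

Exact matching keeps the interface simple but means that "Black" and "black" are treated as different keywords; a real interface might normalize case before comparing.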

79
Implementation
  • The Jasmine Object-Oriented Database
  • We use the Jasmine OODB from CA to store the video
    content description.
  • Jasmine provides ODQL to define, manipulate,
    and query the data in the OODB.
  • An ODQL program:

SHOT shot0;
shot0 = SHOT.new(shotnumber := 0, name := "Shot_0",
    filename := "F:\\G.ARMANI\\Shot1.MPG",
    starttime := "00:00", stoptime := "00:13",
    comment := comment0);
80
Implementation
  • The Content Description of Fashion Show Video

<!ELEMENT VIDEO (ABSTRACT?, SEQUENCE)>
<!ELEMENT SEQUENCE (DESIGNER, SCENE)>
<!ELEMENT SCENE (TOPIC, BACKGROUND, SHOT)>
<!ELEMENT SHOT (CLOTHES, ACCESSORY)>
<!ELEMENT ABSTRACT (#PCDATA)>
<!ELEMENT BACKGROUND EMPTY>
<!ELEMENT CLOTHES EMPTY>
<!ELEMENT ACCESSORY EMPTY>
<!ELEMENT DESIGNER EMPTY>
<!ELEMENT TOPIC EMPTY>
81
Implementation
<!ATTLIST SCENE
    scenenumber NUMBER #REQUIRED
    name CDATA #IMPLIED
    starttime CDATA #REQUIRED
    endtime CDATA #REQUIRED>
<!ATTLIST CLOTHES
    name CDATA #REQUIRED
    department CDATA #REQUIRED
    type (Men|Women|Children) "Men"
    color CDATA #REQUIRED
    season (Arbitrary|Spring|Summer|Fall|Winter) "Arbitrary"
    fabric CDATA #IMPLIED
    narrative CDATA #IMPLIED>
82
(No Transcript)
83
Implementation
  • Video Content Annotation Program

84
Implementation
85
(No Transcript)
86
(No Transcript)
87
(No Transcript)
88
Video Content Query Interface
89
Conclusion
  • A video content-based query system
  • A hierarchical scheme for the content
    descriptions of fashion show video
  • The content description of video is defined using
    an XML DTD.
  • The annotations of video content based on the
    scheme are stored in an object database.
  • We then build a form-based query interface on
    the object database.

90
Future Work
  • The problem of video transmission.
  • In how much detail should we describe the video
    content?
  • In the future we will develop a Query-By-Template
    interface with key frame images, and an iterative
    query interface that allows users to incrementally
    refine their queries until a satisfactory result
    is obtained.
  • We will study how to produce the query interface
    automatically for different kinds of video.

91
An Agent-Based EC System Based on an OODB
92
Background
  • Software Agent
  • XML
  • KQML

93
Software Agent
  • Properties
  • Autonomous, Reactive, Goal-driven, Persistent,
    Social, Intelligent, Mobile
  • Agency
  • A collection of software agents that communicate
    and cooperate with each other is called an agency.

94
XML
  • XML (eXtensible Markup Language) is a description
    language for structured documents. It is a markup
    language, but unlike HTML it does not have a
    fixed set of tags.

95
KQML
  • KQML (Knowledge Query and Manipulation Language)
    is a language and protocol for exchanging
    information and knowledge. KQML is both a message
    format and a message-handling protocol to support
    run-time knowledge sharing among agents.
  • A KQML message is called a performative, in that
    the message is intended to perform some action by
    virtue of being sent.
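For readers unfamiliar with the message format, here is a minimal Python sketch that assembles a KQML performative as text. The `kqml_message` helper, the agent names, and the content expression are hypothetical; only the general shape, a performative name followed by `:keyword value` parameters in parentheses, follows KQML.

```python
def kqml_message(performative, **params):
    """Format a KQML message as a parenthesized performative.

    Python keyword names use underscores, which are turned into the
    hyphens used by KQML parameters (reply_with -> :reply-with).
    """
    fields = " ".join(
        ":" + name.replace("_", "-") + " " + str(value)
        for name, value in params.items()
    )
    return "(" + performative + " " + fields + ")"


# Hypothetical buyer agent asking a seller agent for a price.
message = kqml_message(
    "ask-one",
    sender="buyer",
    receiver="seller",
    reply_with="q1",
    content='"price(widget, ?p)"',
)
print(message)
```

The receiving agent's message handler would parse this text back into a performative name and its parameters before deciding how to act on it.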

96
System Architecture
97
Agency
  • Agent
  • Facilitator
  • Authentication
  • Message Handler
  • Reasoning
  • Document Handler
  • Resource Manager
  • KQML Interpreter

98
Analysis of System Components
  • Facilitator
  • authentication, create agents, delete agents,
    sleep agents, resume agents, and communicate with
    others.
  • Agent
  • action handling, display result, and communicate
    with others.
  • Message Handler
  • send message, receive message, and parse message.

99
Future Work
  • How the schema hierarchy of the XML object database
    affects the performance of document access
  • Extensible template-based query interface
  • Applications of XML object database in electronic
    commerce