Hyperset/Web-like Databases and the Experimental Implementation of the Query Language Delta

1
Hyperset/Web-like Databases and the Experimental
Implementation of the Query Language Delta
  • Mr Richard Molyneux
  • Also co-authored by Dr Vladimir Sazonov
  • Department of Computer Science
  • University of Liverpool

2
Introduction
  • Semi-structured databases (SSD):
  • Schema-less.
  • Self-describing.
  • Typically represented by a graph.
  • Examples: hypertext documents and XML.
  • Web-like databases (WDB).
  • Known query languages for SSD: Lorel, UnQL,
    UnCAL, G-Log, XML-QL, XSLT, XSL and XQuery
    (XPath).

3
Introduction: Hyperset Approach
  • In general:
  • arbitrary sets of sets of sets, even with cycles,
  • no prescribed structure.
  • Hypersets: a generalisation of the relational
    approach.
  • relational database: a set of relations.
  • relation: a set of tuples.
  • tuple: a set of labelled values.

4
Hyperset Approach
  • Hyperset data are represented as graphs, or
    equivalently as systems of set equations.
  • Analogous to the Web, hence
  • Web-like databases (WDB).

bob = {wife:alice, name:Bob}
alice = {husband:bob, name:Alice, pet:sam}
sam = {name:Sam, species:cat}
5
Set Equality: Bisimulation
b2 = {author:jones, title:Databases}
p3 = {author:jones, title:Databases}
  • Therefore, b2 = p3,
  • or b2 is bisimilar to p3.
  • Thus, is a book equal to a paper,
  • or is this the result of bad database design?
  • Anyway, this WDB
  • is formally allowed, like any other graph or system
    of set equations, and
  • will illustrate bisimulation issues.

6
Set Equality: Bisimulation (Better Database Design)
  • Therefore, b2 ≠ p3,
  • or b2 is not bisimilar to p3.

b2 = {author:jones, title:Databases, type:Book}
p3 = {author:jones, title:Databases, type:Paper}
7
Set Equality: Bisimulation
  • Bisimulation: equality between any graph nodes
    or set names.
  • Any two sets are equal if, for each (labelled)
    element of the first set, there exists an equal
    (bisimilar) element in the second set, and vice versa.
  • In general, this is a recursive procedure of
    computing deep equality.

b2 = {author:jones, title:Databases, title:Databases}
p3 = {title:Databases, author:jones}
Therefore, b2 = p3, or b2 is bisimilar to p3.
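The deep-equality check described on this slide can be sketched in Python as a greatest-fixpoint computation: start by assuming every pair of sets is equal, then discard pairs that violate the "for each element there is an equal element, and vice versa" condition until nothing changes. This is only an illustrative sketch, not the authors' implementation. The dictionary encoding, the function name `bisimulation`, and the treatment of names without an equation as atoms (equal only to themselves) are assumptions of the example.

```python
from itertools import product


def bisimulation(eqs):
    """Return the set of bisimilar (a, b) pairs for a system of set equations.

    eqs maps each set name to a list of (label, target) pairs. A target with
    no equation of its own is treated as an atom that is equal only to
    itself (an assumption of this sketch).
    """
    names = set(eqs)
    for members in eqs.values():
        names.update(target for _, target in members)

    def is_atom(n):
        return n not in eqs

    # Start from every candidate pair; atoms are only paired with themselves.
    rel = set()
    for a, b in product(names, repeat=2):
        if is_atom(a) or is_atom(b):
            if a == b:
                rel.add((a, b))
        else:
            rel.add((a, b))

    def covered(a, b):
        # Every labelled element of a has a same-label, related element in b.
        return all(
            any(la == lb and (xa, xb) in rel for lb, xb in eqs[b])
            for la, xa in eqs[a]
        )

    # Discard violating pairs until the relation is stable.
    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            if is_atom(a):
                continue
            if not (covered(a, b) and covered(b, a)):
                rel.discard((a, b))
                changed = True
    return rel


# The slide's example: the repeated title does not affect equality.
example = {
    "b2": [("author", "jones"), ("title", "Databases"), ("title", "Databases")],
    "p3": [("title", "Databases"), ("author", "jones")],
}
print(("b2", "p3") in bisimulation(example))  # True
```

Because pairs are only ever removed from an initially complete relation, cyclic equations such as x = {a:x} and y = {a:y} are handled without infinite recursion.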
8
Hyperset Approach - Implementation
  • In our implementation, systems of set equations
    can be transformed into an XML representation (and
    vice versa).
  • In general, arbitrary XML elements can
    participate.

<?xml version="1.0"?>
<set:eqns xmlns:set="...">
  <set:eqn set:id="bob">
    <name>Bob</name>
    <wife set:ref="alice" />
  </set:eqn>
  <set:eqn set:id="alice">
    <name>Alice</name>
    <husband set:ref="bob" />
    <pet set:ref="sam" />
  </set:eqn>
  <set:eqn set:id="sam">
    <name>Sam</name>
    <species>cat</species>
  </set:eqn>
</set:eqns>

bob = {wife:alice, name:Bob}
alice = {husband:bob, name:Alice, pet:sam}
sam = {name:Sam, species:cat}
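As a rough illustration of the XML-to-equations direction, the following Python sketch reads a document of this shape with the standard `xml.etree.ElementTree` module. The namespace URI is elided in the slide, so `urn:example:set` is a placeholder, and the function name `parse_set_equations` and the output encoding are likewise assumptions of the example rather than part of the described implementation.

```python
import xml.etree.ElementTree as ET

# Placeholder only: the slide elides the real namespace URI ("...").
SET_NS = "urn:example:set"

XML_DOC = f"""<?xml version="1.0"?>
<set:eqns xmlns:set="{SET_NS}">
  <set:eqn set:id="bob">
    <name>Bob</name>
    <wife set:ref="alice" />
  </set:eqn>
  <set:eqn set:id="alice">
    <name>Alice</name>
    <husband set:ref="bob" />
    <pet set:ref="sam" />
  </set:eqn>
  <set:eqn set:id="sam">
    <name>Sam</name>
    <species>cat</species>
  </set:eqn>
</set:eqns>"""


def parse_set_equations(xml_text):
    """Map each set name to its list of (label, target) pairs."""
    root = ET.fromstring(xml_text)
    eqs = {}
    for eqn in root.findall(f"{{{SET_NS}}}eqn"):
        name = eqn.get(f"{{{SET_NS}}}id")
        members = []
        for child in eqn:
            ref = child.get(f"{{{SET_NS}}}ref")
            # A set:ref points at another equation; otherwise use the text.
            target = ref if ref is not None else (child.text or "").strip()
            members.append((child.tag, target))
        eqs[name] = members
    return eqs


for name, members in parse_set_equations(XML_DOC).items():
    body = ", ".join(f"{label}:{target}" for label, target in members)
    print(f"{name} = {{{body}}}")
```

The sketch does not distinguish a text value from a reference with the same spelling; a fuller version would tag the two cases separately.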
9
Delta: Query Language for Hyperset WDB
  • Previously a theoretical hyperset query language.
  • Sound theoretical background.
  • Expressive power is characterised in terms of
    polynomial time, thus
  • computationally viable (in theory),
  • sufficiently complete (no gaps in the language).
  • Considers WDB up to bisimulation.

10
Delta operators
  • Delta has rich expressive power provided by its
    operators.
  • The implemented language retains this expressive
    power with additional features.
  • Delta expressions are divided into:
  • Terms (set queries): set valued.
  • Formulas (Boolean queries): truth valued.

11
Delta: Set-Valued Operations
  • Collection (separation): similar to SQL select.
  • Recursion: recursive version of separate.
  • Decoration (plan performance operator): useful
    for restructuring queries.
  • Other useful set-theoretic operations:
  • Enumeration: {l1:x1, l2:x2, ..., ln:xn}
  • Union: U x, x U y
  • Transitive closure: TC(x)

collect {s(l,x) where l:x in t and F(l,x)}
separate {l:x in t where F(l,x)}
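To make the comparison with SQL select concrete, collect and separate can be mimicked in Python as a filter and a filter-then-map over labelled elements. This is a loose analogy under stated assumptions: sets are modelled as lists of (label, element) pairs, the names `separate`, `collect` and the sample data are hypothetical, and elements are compared by name rather than up to bisimulation as Delta would require.

```python
def separate(t, condition):
    """{l:x in t where F(l,x)}: keep the labelled elements that satisfy F."""
    return [(label, elem) for (label, elem) in t if condition(label, elem)]


def collect(t, build, condition):
    """{s(l,x) where l:x in t and F(l,x)}: filter with F, then apply s."""
    return [build(label, elem) for (label, elem) in t if condition(label, elem)]


# Hypothetical data in the spirit of the bibliography examples.
bib = [("book", "b1"), ("book", "b2"), ("paper", "p1")]

books = separate(bib, lambda label, elem: label == "book")
papers_as_pubs = collect(
    bib,
    lambda label, elem: ("pub", elem),      # s(l, x): relabel the element
    lambda label, elem: label == "paper",   # F(l, x): selection condition
)
print(books)
print(papers_as_pubs)
```

The only difference between the two operators in this sketch is that collect may transform each selected element, whereas separate returns it unchanged.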
12
Delta: Boolean-Valued Operations
  • Equality: bisimulation.
  • Label relations: l1 < l2, l1 substring of l2, ...
  • Membership: l:x in y
  • Logical operators: And, Or, Not, Implies
  • Bounded (computable) quantifiers:
  • Everything is bounded in Delta!

forall l:x in t F(x,l)
exists l:x in t F(x,l)
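Bounded quantification means that a quantifier only ever ranges over the labelled elements of a given set t, so evaluating it is a finite loop. A minimal Python sketch follows, again assuming sets are lists of (label, element) pairs and comparing elements by name; the function names and the sample equation are illustrative only.

```python
def forall(t, formula):
    """forall l:x in t F(x,l): F holds for every labelled element of t."""
    return all(formula(elem, label) for (label, elem) in t)


def exists(t, formula):
    """exists l:x in t F(x,l): F holds for at least one labelled element of t."""
    return any(formula(elem, label) for (label, elem) in t)


# Hypothetical set with two outgoing refers-to edges.
b1 = [("refers-to", "b2"), ("refers-to", "p1")]

refers_to_b2 = exists(b1, lambda elem, label: label == "refers-to" and elem == "b2")
only_references = forall(b1, lambda elem, label: label == "refers-to")
print(refers_to_b2, only_references)  # True True
```

Since the range t is always a finite set, neither quantifier can trigger an unbounded search.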
13
Delta: Other Features
  • Set queries can participate in Boolean queries,
    and vice versa.
  • The implemented language has block structure.
  • Useful things:
  • Libraries and declarations: the ability to define
  • set constants,
  • label constants,
  • queries to be invoked by query calls.
  • If-then-else.
  • Full description of Delta syntax (as BNF).

14
Query Execution
  • Three stages:
  • Parsing: checking that the query is well-formed.
  • Contextual analysis: checking that the query
  • is well-typed,
  • contains no undeclared identifiers.
  • Query evaluation:
  • Extend the WDB with a new set equation,
  • result = query.
  • Simplify the extended system of set equations
    until no complex set or Boolean expressions
    remain.

15
Example of Distributed WDB
  • Distributed WDB: two XML files,
  • URL1 (represented as a system of set equations)
  • and URL2.

URL1: grey nodes. URL2: white nodes.
BibDB = {book:b1, book:b2, paper:URL2#p1,
         paper:URL2#p2, paper:URL2#p3}
b1 = {refers-to:b2, refers-to:URL2#p1}
b2 = {author:Jones, title:Jones}
16
Example of Query
  • Example query (in natural language):

Find all publications which refer to the book b2
in the Bibliography database (BibDB).

  • Example query (in Delta):

set query
let set constant BibDB be URL1#BibDB,
    set constant b2 be URL1#b2
in collect {pub-type:pub where pub-type:pub in BibDB
            and exists refers-to:ref in pub . ref = b2}
endlet
17
Example of Simple Query
  • Query result (after simplification):

result = {book:URL1#b1, paper:URL2#p2}

  • Recall the query:
  • Find all publications which refer to
    the book b2 in the Bibliography
    database (BibDB).
  • p2 also refers to b2,
  • because it refers to p3,
  • which is bisimilar (equal) to b2.
18
Example of Query: Restructuring WDB
  • Transform the WDB to any required structure, e.g.

set query
let set constant BibDB = URL1#BibDB in
let set constant restructuredBibDB be
  (U collect {null:if (L=Paper or L=Book)
              then {publication:X,
                    type:call Pair(call Second(X), L),
                    L:call Pair(L, {})}
              else {L:X} fi
              where L:X in call GraphOfPairs(BibDB)})
in decorate (restructuredBibDB, BibDB)
endlet endlet
That one publication has the type both of book
and paper is the result of the initial design of
BibDB. It is not a failure of the above query.
19
Example of Query with Path Expressions (Not Yet
Implemented)
  • Useful feature for selecting nodes to arbitrary
    depth.

Query:
set query select {pub-type:x in BibDB
  where exists <b1>refers-to<x>refers-to<b2> . author:"Smith" in x}
Result:
result = {paper:URL2#p2}
20
Remaining Tasks
  • Straightforward computation of bisimulation
    across a distributed WDB is intractable.
  • Potential solutions to this problem:

Local/global approximations: distributed
computation of bisimulation locally at each site,
using these local approximations to compute
the global bisimulation.
Maintaining strong extensionality: ensuring that the
WDB and all updates maintain strong extensionality,
i.e. all nodes must be non-bisimilar (no redundancies).
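For a single, non-distributed system of set equations, strong extensionality can be restored by grouping bisimilar nodes and keeping one representative per group. The sketch below uses signature-based partition refinement, a standard technique that is not claimed to be the authors' method and does not address the distributed case discussed on this slide. The function names, the dictionary encoding, and the treatment of names without an equation as atoms are assumptions of the example.

```python
def _renumber(keys):
    """Replace arbitrary hashable keys by small integer block ids."""
    table = {}
    return {name: table.setdefault(key, len(table)) for name, key in keys.items()}


def bisimulation_classes(eqs):
    """Map every name to a block id; bisimilar names share the same id."""
    names = sorted(set(eqs) | {t for ms in eqs.values() for _, t in ms})
    # Initial partition: all defined sets together, each atom on its own.
    ids = _renumber({n: ("set",) if n in eqs else ("atom", n) for n in names})
    while True:
        # Signature: current block plus the (label, block) pairs of the members.
        keys = {
            n: (ids[n], frozenset((l, ids[t]) for l, t in eqs.get(n, [])))
            for n in names
        }
        new_ids = _renumber(keys)
        if len(set(new_ids.values())) == len(set(ids.values())):
            return new_ids  # no block was split: the partition is stable
        ids = new_ids


def make_strongly_extensional(eqs):
    """Keep one equation per bisimulation class and redirect all references."""
    ids = bisimulation_classes(eqs)
    rep = {}
    for name in sorted(ids):  # first name in sorted order represents its class
        rep.setdefault(ids[name], name)

    def canon(name):
        return rep[ids[name]]

    return {
        name: sorted({(label, canon(target)) for label, target in members})
        for name, members in eqs.items()
        if canon(name) == name
    }


# b2 and p3 are bisimilar, so only one of them survives.
wdb = {
    "BibDB": [("book", "b2"), ("paper", "p3")],
    "b2": [("author", "jones"), ("title", "Databases"), ("title", "Databases")],
    "p3": [("title", "Databases"), ("author", "jones")],
}
print(make_strongly_extensional(wdb))
```

After the collapse, every remaining set name denotes a distinct set, so equality tests reduce to comparing names.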
21
Comparative Analysis
  • Our approach is top-down (theory to practice),
    compared to the bottom-up approach of most others.
  • UnQL and UnCAL are closest to the hyperset approach:
  • embeddable within Delta, but not vice versa.
  • But their expressive power is theoretically
    unclear.
  • Still more like graph query languages.
  • Lorel is a pure graph query language:
  • Formally incomparable with Delta.
  • Ignores bisimulation.
  • But there are some similarities with Delta.
  • Most query languages for semi-structured
    databases derive their expressive power from path
    expressions, whereas in Delta these are only
    syntactic sugar: practically very convenient,
    but formally unnecessary.

22
Conclusion
  • Current version of implementation is complete.
  • Implementation available online at the Appendix
    Page to this talk:
  • http://www.csc.liv.ac.uk/molyneux/ICSOFT2007appendix/
  • Some key features are not implemented yet, such as
  • path expressions,
  • distributed evaluation of bisimulation/equality
    in background time.
  • Despite the theoretical nature of our approach, we
    have some important practical features, such as
  • XML representation of data,
  • syntactic sugaring,
  • libraries and declarations of queries.
  • Delta has rich expressive power (= PTIME) and
    solid mathematical foundations.

23
  • Questions?