Title: AVOIDING UNNECESSARY ORDERING OPERATIONS IN XPATH


1
AVOIDING UNNECESSARY ORDERING OPERATIONS IN XPATH
  • Jan Hidders
  • Philippe Michiels

2
Problem
  • XPath semantics require the result of a query to
    be in doc-order and contain no duplicate nodes
  • Many implementations achieve this by explicitly
    ordering intermediate results or sacrificing
    correctness for efficiency
  • For large input documents, explicit ordering
    degrades performance
  • Motivation: Galax

3
XPath implementation
  • Explicit sorting and duplicate elimination after
    each step by inserting distinct-docorder (ddo)
    operations in the query plan
  • Semantics of an implementation without ddos:
    the "sloppy semantics"
  • Example: /descendant-or-self::a/child::b
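
To make the difference concrete, here is a minimal Python sketch that evaluates the example path on a small tree, once without ddo (sloppy) and once with ddo after each step. The Node class, the helper names, and the tree itself are assumptions made for illustration; none of them come from Galax or from the slides.

```python
class Node:
    """Hypothetical tree node; `pos` is its position in document order."""
    def __init__(self, name, pos, parent=None):
        self.name, self.pos, self.parent = name, pos, parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def descendant_or_self(n):
    # Preorder walk: the node itself, then each subtree in document order.
    yield n
    for c in n.children:
        yield from descendant_or_self(c)

def child(n):
    return iter(n.children)

def step(context, axis, name):
    # Sloppy step: apply the axis to each context node in turn and
    # concatenate the matches, with no sorting or duplicate removal.
    return [m for n in context for m in axis(n) if m.name == name]

def ddo(nodes):
    # distinct-docorder: remove duplicates, then sort by document order.
    return sorted(set(nodes), key=lambda n: n.pos)

# Assumed document: <a><a><b/></a><b/></a>
doc = Node("#document", 0)
a1 = Node("a", 1, doc)
a2 = Node("a", 2, a1)
b3 = Node("b", 3, a2)
b4 = Node("b", 4, a1)

# /descendant-or-self::a/child::b
sloppy = step(step([doc], descendant_or_self, "a"), child, "b")
formal = ddo(step(ddo(step([doc], descendant_or_self, "a")), child, "b"))

sloppy_positions = [n.pos for n in sloppy]
formal_positions = [n.pos for n in formal]
print(sloppy_positions, formal_positions)
```

On this tree the sloppy evaluation yields positions 4, 3, while the version with ddo yields 3, 4: the b child of the outer a is emitted before the b child of the nested a.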

4
XPath Properties
  • We need to identify several properties of path
    expressions that assist us in determining whether
    their sloppy semantics are equal to their formal
    semantics
  • Two main properties:
      • ord (result always in order)
      • nodup (result never contains duplicate nodes)
  • Additional properties are necessary, e.g.:
      • gen (all nodes in result belong to same
        generation)
      • max1 (result contains at most one node)
      • unrel (result contains no related nodes)
      • lin (all nodes in result are anc-desc related)
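
As a rough illustration, the six properties can be written as checks on one concrete result sequence. Note the difference in scope: on the slide they are properties of a path expression (they must hold for every document), whereas this sketch only tests a single result. The Node class and function names are hypothetical, and reading "in order" as non-decreasing document positions is an assumption.

```python
from itertools import combinations

class Node:
    """Hypothetical node; `pos` is its position in document order."""
    def __init__(self, name, pos, parent=None):
        self.name, self.pos, self.parent = name, pos, parent

def depth(n):
    d = 0
    while n.parent is not None:
        n, d = n.parent, d + 1
    return d

def is_ancestor(a, d):
    # True if `a` is a proper ancestor of `d`.
    p = d.parent
    while p is not None:
        if p is a:
            return True
        p = p.parent
    return False

def related(x, y):
    return is_ancestor(x, y) or is_ancestor(y, x)

def is_ord(nodes):    # assumed reading: positions never decrease
    return all(x.pos <= y.pos for x, y in zip(nodes, nodes[1:]))

def is_nodup(nodes):  # no node occurs twice
    return len(set(nodes)) == len(nodes)

def is_gen(nodes):    # all nodes at the same depth
    return len({depth(n) for n in nodes}) <= 1

def is_max1(nodes):   # at most one node
    return len(nodes) <= 1

def is_unrel(nodes):  # no two nodes are ancestor/descendant of each other
    return not any(related(x, y) for x, y in combinations(set(nodes), 2))

def is_lin(nodes):    # every two nodes are ancestor/descendant related
    return all(related(x, y) for x, y in combinations(set(nodes), 2))

# Assumed tree: r has children a and d; a has children b and c.
r = Node("r", 0)
a = Node("a", 1, r)
b = Node("b", 2, a)
c = Node("c", 3, a)
d = Node("d", 4, r)

result = [b, c, d]
report = {
    "ord": is_ord(result), "nodup": is_nodup(result), "gen": is_gen(result),
    "max1": is_max1(result), "unrel": is_unrel(result), "lin": is_lin(result),
}
print(report)
```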

5
Rules for the Inference of Path Properties
  • We define a set of inference rules for the
    deduction of the ord and nodup properties
  • For example:

The gen property is preserved by the child,
parent, foll-sibl, and prec-sibl axes.
If the gen property holds, then the ord property
is preserved by the parent axis.
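
The two example rules can be sketched as a small property-propagation function. This is only an illustration under stated assumptions: it encodes just the two rules quoted above (the full rule set is not shown on the slide), it conservatively drops any property for which no rule is known, and the names infer_step and infer_path are hypothetical.

```python
# Axes that preserve gen, as stated in the first example rule.
GEN_PRESERVING = {"child", "parent", "foll-sibl", "prec-sibl"}

def infer_step(props, axis):
    """Return the properties known to hold after one step on `axis`.

    Only the two example rules are encoded; anything not covered by
    them is conservatively treated as unknown and dropped.
    """
    out = set()
    # Rule 1: gen is preserved by child, parent, foll-sibl, prec-sibl.
    if "gen" in props and axis in GEN_PRESERVING:
        out.add("gen")
    # Rule 2: if gen holds, ord is preserved by the parent axis.
    if {"gen", "ord"} <= set(props) and axis == "parent":
        out.add("ord")
    return frozenset(out)

def infer_path(props, axes):
    """Propagate properties through a sequence of axis steps."""
    props = frozenset(props)
    for axis in axes:
        props = infer_step(props, axis)
    return props

print(sorted(infer_path({"gen", "ord"}, ["parent"])))
print(sorted(infer_path({"gen", "ord"}, ["child", "parent"])))
```

With only these two rules, ord survives a parent step when gen also holds, but is lost after a child step because no rule for ord on the child axis is encoded here.
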
6
Deterministic Automata
  • The rules allow us to construct a deterministic
    automaton that verifies whether the sloppy
    semantics of XPath queries have the nodup/ord
    property
  • For brevity, we only discuss the automaton that
    checks for the ord-property

7
A_ord Automaton
8
Example (1)
/desc-or-self::a/child::b/foll-sibl::b/parent::a
9
Example (2)
/ → 1
10
Example (3)
/ → 1
desc-or-self::a → 1,6
11
Example (4)
/ → 1
desc-or-self::a → 1,6
child::b → 2,3,4,5,9,10,7,8
12
Example (5)
/ → 1
desc-or-self::a → 1,6
child::b → 2,3,4,5,9,10,7,8
foll-sibl::b → 3,4,5,4,5,5,10,8
13
Example (6)
/ → 1
desc-or-self::a → 1,6
child::b → 2,3,4,5,9,10,7,8
foll-sibl::b → 3,4,5,4,5,5,10,8
parent::a → 1,1,1,1,1,1,1,6
14
Soundness and Completeness
  • For the XPath fragment
      • P ::= A | P/A
      • A ::= parent | child | ancestor | descendant | ...
  • the set of inference rules is sound and complete
    for the ord and nodup properties
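
The fragment is simply a non-empty sequence of axis steps separated by "/". A minimal recognizer might look as follows; the function name in_fragment is hypothetical, and the axis set contains only the four axes legible in the transcript, since the rest of the list is elided there.

```python
# Only the four axes named on the slide. The slide's list continues
# ("..."); the remaining axes are deliberately not guessed here.
AXES = {"parent", "child", "ancestor", "descendant"}

def in_fragment(path):
    """Recognize P ::= A | P/A: one or more axis steps joined by '/'."""
    steps = path.split("/")
    # An empty path or an empty step (e.g. "child//parent") is rejected,
    # because "" is not an axis name.
    return all(s in AXES for s in steps)

print(in_fragment("child/parent/descendant"))
print(in_fragment("child//parent"))
```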

15
Conclusions
  • We can derive whether a query, evaluated under the
    sloppy semantics, returns a result that is free
    from duplicates and/or in document order
  • We can use this knowledge to:
      • eliminate unnecessary ddo-operations in the query
        plan
      • rewrite the query to avoid generating
        unnecessary ordering operations

16
Further Work
  • Our automaton does not consider ddo-operations in
    the query plan. The automaton does not define
    transitions for ddo-operations
  • For example, if after every step in the
    expression child/foll-sibl/child, a
    sorting operation is performed, there is no need
    for sorting at the end of the expression. But our
    algorithm is incapable of deducing this.

17
Further Work (2)
  • How to decide where to sort?

Example: /descendant-or-self::a/child::b/parent::a