Structural Joins: A Primitive for Efficient XML Query Pattern Matching

Title: Structural Joins: A Primitive for Efficient XML Query Pattern Matching


1
Structural Joins: A Primitive for Efficient XML Query Pattern Matching
  • [Alkh02]

2
Example XPath Query
  • book[title='XML']//author[.='jane']
  • Structural Relationships
  • book/title, title/XML, book//author,
    author/jane

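(Figure: query tree rooted at book, with nodes title, author, XML, and jane connected by the structural relationships listed above.)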
3
Overview
  • Range-based XML Numbering Scheme
  • (DocId, StartPos : EndPos, LevelNum)
  • Structural Relationships
  • (D2, S2 : E2, L2) is a descendant of (D1, S1 : E1, L1)
    iff D1 = D2, S1 < S2, and E2 < E1
  • Parent-child: the above conditions plus L1 + 1 = L2
  • 2 Families of Structural Join Algorithms
  • Tree-Merge (Anc and Desc versions)
  • Stack-Tree (Anc and Desc versions)
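
The containment tests above can be sketched in a few lines of Python. This is a minimal illustration, not code from [Alkh02]: the `Node` tuple, the function names, and the sample position values are all assumptions made for the example.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """One element under the range-based numbering scheme (names are illustrative)."""
    doc_id: int
    start: int   # StartPos
    end: int     # EndPos
    level: int   # LevelNum


def is_descendant(anc: Node, desc: Node) -> bool:
    # Same document, and desc's interval lies strictly inside anc's interval.
    return anc.doc_id == desc.doc_id and anc.start < desc.start and desc.end < anc.end


def is_child(parent: Node, child: Node) -> bool:
    # Descendant test plus the level condition L1 + 1 = L2.
    return is_descendant(parent, child) and parent.level + 1 == child.level


# Hypothetical positions for a small document fragment.
book = Node(doc_id=1, start=1, end=20, level=1)
title = Node(doc_id=1, start=2, end=5, level=2)
author = Node(doc_id=1, start=8, end=13, level=3)
other_doc = Node(doc_id=2, start=2, end=5, level=2)
```

With these values, `title` is both a descendant and a child of `book`, while `author` is a descendant but not a child because its level is two below the root.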

4
A Sample XML Document Fragment: Tree Representation
5
Structural Join Algorithms
  • AList: [a1, a2, ...]
  • list of potential ancestors, sorted on StartPos
  • DList: [d1, d2, ...]
  • list of potential descendants, sorted on StartPos
  • OutputList: [(ai, dj), ...]
  • join results, sorted
  • either by (DocId, ai.StartPos, dj.StartPos) // Anc version
  • or by (DocId, dj.StartPos, ai.StartPos) // Desc version
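
To make the input and output contract concrete, here is a brute-force reference join in Python. It is deliberately not one of the algorithms from [Alkh02]: it compares every pair and then sorts, which is only useful for checking what the Anc and Desc orderings look like. The `Node` tuple, the function names, and the sample values are assumptions for the example.

```python
from typing import NamedTuple


class Node(NamedTuple):
    doc_id: int
    start: int   # StartPos
    end: int     # EndPos
    level: int   # LevelNum


def contains(a: Node, d: Node) -> bool:
    # Ancestor-descendant test on the range-based numbering.
    return a.doc_id == d.doc_id and a.start < d.start and d.end < a.end


def naive_structural_join(alist, dlist, order="anc"):
    """Quadratic reference join; returns (ancestor, descendant) pairs."""
    pairs = [(a, d) for a in alist for d in dlist if contains(a, d)]
    if order == "anc":
        # Sorted by (DocId, ai.StartPos, dj.StartPos).
        pairs.sort(key=lambda p: (p[0].doc_id, p[0].start, p[1].start))
    elif order == "desc":
        # Sorted by (DocId, dj.StartPos, ai.StartPos).
        pairs.sort(key=lambda p: (p[1].doc_id, p[1].start, p[0].start))
    else:
        raise ValueError("order must be 'anc' or 'desc'")
    return pairs


# Hypothetical data: a2 is nested in a1; d1 is inside a2, d2 is inside a1 only.
a1 = Node(1, 1, 10, 1)
a2 = Node(1, 2, 7, 2)
d1 = Node(1, 3, 4, 3)
d2 = Node(1, 8, 9, 2)

anc_order = naive_structural_join([a1, a2], [d1, d2], order="anc")
desc_order = naive_structural_join([a1, a2], [d1, d2], order="desc")
```

Both calls return the same three pairs; only their order differs, which is the distinction between the Anc and Desc versions of each algorithm family.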

6
Algorithm Tree-Merge-Anc
7
Algorithm Tree-Merge-Desc
8
Stack-Tree Algorithms
  • Depth first traversal of XML tree
  • Conceptual merge of AList nodes and DList nodes on
    StartPos
  • Stack of AList nodes
  • Node pushed onto the stack is a descendant of the
    node below it on the stack
  • 3 cases (Stack-Tree-Desc version)
  • AList/DList node is not a descendant of stack top: pop
  • AList node is a descendant of stack top: push
  • DList node is a descendant of stack top: output
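
The three cases above can be sketched as follows. This is a simplified Python rendering of the stack-based idea for the ancestor-descendant case only, written from the bullets on this slide rather than copied from the pseudocode in [Alkh02]; the `Node` tuple, the function name, and the sample positions are assumptions.

```python
from typing import NamedTuple


class Node(NamedTuple):
    doc_id: int
    start: int   # StartPos
    end: int     # EndPos
    level: int   # LevelNum


def stack_tree_desc(alist, dlist):
    """Join two lists sorted on (DocId, StartPos); output is in Desc order."""
    out, stack = [], []
    i = j = 0
    while j < len(dlist):
        d = dlist[j]
        # Conceptual merge: pick whichever list head starts first.
        take_anc = i < len(alist) and \
            (alist[i].doc_id, alist[i].start) < (d.doc_id, d.start)
        e = alist[i] if take_anc else d
        # Case 1: e is not a descendant of the stack top, so pop.
        while stack and (stack[-1].doc_id != e.doc_id or stack[-1].end < e.start):
            stack.pop()
        if take_anc:
            # Case 2: an AList node nested in the stack top (or an empty stack): push.
            stack.append(e)
            i += 1
        else:
            # Case 3: a DList node joins with every node on the stack,
            # emitted bottom to top so ancestors appear in StartPos order.
            out.extend((a, e) for a in stack)
            j += 1
    return out


# Hypothetical positions for the nesting a1 d1 a2 d2 a3 d3 d4 d5 d6 (n = 3).
a1, a2, a3 = Node(1, 1, 18, 1), Node(1, 4, 15, 2), Node(1, 7, 12, 3)
d1, d2, d3 = Node(1, 2, 3, 2), Node(1, 5, 6, 3), Node(1, 8, 9, 4)
d4, d5, d6 = Node(1, 10, 11, 4), Node(1, 13, 14, 3), Node(1, 16, 17, 2)

result = stack_tree_desc([a1, a2, a3], [d1, d2, d3, d4, d5, d6])
```

Each input node is pushed at most once and popped at most once, so the work beyond producing the output is linear in the two input lists.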

9
Algorithm Stack-Tree-Desc (parent/child case)
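  • Input, merged on StartPos: a1 d1 a2 d2 ... an dn dn+1 dn+2 ... d2n
  • (Figure: nested tree in which a1 has children d1, a2, and d2n; a2 has children d2, a3, and d2n-1; and so on down to an, which has children dn and dn+1; shown alongside the stack of AList nodes.)
  • Pop test: e.startPos > stack->top.endPos
  • Output, in order:
    (a1,d1), (a2,d2), ..., (an-1,dn-1), (an,dn), (an,dn+1), (an-1,dn+2), ..., (a3,d2n-2), (a2,d2n-1), (a1,d2n)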
10
Algorithm Stack-Tree-Desc
11
Algorithm Stack-Tree-Anc
  • Problem
  • Sorting on StartPos of DList nodes is
    natural/easy
  • Sorting on StartPos of AList nodes is not
  • Solution
  • keep 2 lists of matching descendant nodes with
    each stack node
  • self-list
  • inherit-list
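
A minimal Python sketch of the self-list and inherit-list idea follows, for the ancestor-descendant case. It is written from the bullets above, not taken from the pseudocode in [Alkh02]; the `Node` tuple, the function name, and the sample positions are assumptions.

```python
from typing import NamedTuple


class Node(NamedTuple):
    doc_id: int
    start: int   # StartPos
    end: int     # EndPos
    level: int   # LevelNum


def stack_tree_anc(alist, dlist):
    """Join two lists sorted on (DocId, StartPos); output is in Anc order."""
    out = []
    stack = []  # each entry is [node, self_list, inherit_list]

    def pop():
        _node, self_list, inherit_list = stack.pop()
        finished = self_list + inherit_list
        if stack:
            # Results of a nested ancestor wait in the inherit-list of the node below.
            stack[-1][2].extend(finished)
        else:
            # The bottom node has no earlier ancestor left, so its results can be emitted.
            out.extend(finished)

    i = j = 0
    while j < len(dlist):
        d = dlist[j]
        take_anc = i < len(alist) and \
            (alist[i].doc_id, alist[i].start) < (d.doc_id, d.start)
        e = alist[i] if take_anc else d
        # Pop every stack node that ends before e starts (or is in another document).
        while stack and (stack[-1][0].doc_id != e.doc_id or stack[-1][0].end < e.start):
            pop()
        if take_anc:
            stack.append([e, [], []])
            i += 1
        else:
            # Record the match in the self-list of every ancestor on the stack.
            for entry in stack:
                entry[1].append((entry[0], e))
            j += 1
    while stack:
        pop()
    return out


# Hypothetical positions for the nesting a1 d1 a2 d2 a3 d3 d4 d5 d6 (n = 3).
a1, a2, a3 = Node(1, 1, 18, 1), Node(1, 4, 15, 2), Node(1, 7, 12, 3)
d1, d2, d3 = Node(1, 2, 3, 2), Node(1, 5, 6, 3), Node(1, 8, 9, 4)
d4, d5, d6 = Node(1, 10, 11, 4), Node(1, 13, 14, 3), Node(1, 16, 17, 2)

result = stack_tree_anc([a1, a2, a3], [d1, d2, d3, d4, d5, d6])
```

The sketch copies lists when a node is popped, which keeps it short; a real implementation would more likely splice linked lists in constant time.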

12
Algorithm Stack-Tree-Anc(parent/child case)
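  • Pop test: e.startPos > stack->top.endPos
  • Input, merged on StartPos: a1 d1 a2 d2 ... an dn dn+1 dn+2 ... d2n
  • (Figure: stack contents from top to bottom, with the self-list and inherit-list kept at each node)
  • an: self-list (an,dn), (an,dn+1)
  • . . .: self-list (an-1,dn-1); inherit-list (an,dn), (an,dn+1)
  • . . .
  • a2: self-list (a2,d2), (a2,d2n-1); inherit-list (a3,d3), (a3,d2n-2) ... (an,dn), (an,dn+1)
  • a1: self-list (a1,d1), (a1,d2n); inherit-list (a2,d2), (a2,d2n-1) ... (an,dn), (an,dn+1)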
13
Algorithm Stack-Tree-Anc
14
Experiment: XML Data and Queries
15
Experimental Results
16
Efficient Structural Joins on Indexed XML Documents
  • [Chie02]

17
Motivation (Why use indices?)
18
A Sample XML Document
19
XML Data Indexed with B+ Tree
  • Key (DocID, tag name, StartPos)
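
One way to see what this composite key offers is the small Python sketch below. A sorted list searched with `bisect` stands in for the index leaf level; this is an assumption made purely for illustration and is not the structure or the algorithm from [Chie02]. The function name `seek` and the sample entries are hypothetical.

```python
import bisect

# Index entries keyed on (DocID, tag name, StartPos), kept in key order.
index = sorted([
    (1, "book", 1),
    (1, "title", 2),
    (1, "author", 8),
    (1, "title", 25),
    (1, "author", 30),
    (2, "author", 4),
])


def seek(entries, doc_id, tag, min_start):
    """Return the first entry for (doc_id, tag) with StartPos >= min_start, or None."""
    pos = bisect.bisect_left(entries, (doc_id, tag, min_start))
    if pos < len(entries) and entries[pos][:2] == (doc_id, tag):
        return entries[pos]
    return None


next_author = seek(index, 1, "author", 9)
no_author = seek(index, 1, "author", 31)
first_title = seek(index, 1, "title", 0)
```

Because entries with the same document and tag are adjacent and ordered by StartPos, a join can jump directly to the next candidate element instead of scanning every element in between.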

20
Algorithm Anc_Des_B+
21
Typo
  • Section 3, "Structural Join using B-trees" [Chie02]
  • 4th paragraph (i.e., the 1st paragraph of the right
    column of the 4th page)
  • Correction
  • "Figure 3a depicts ... (2) pop a3 and a2 from the
    stack ... A list ... (4) push ... as follows: after a2
    is popped from the stack, directly go to a14.
    Here ... than a2.end."
  • should read
  • "Figure 3a depicts ... (2) pop a3 from the stack ...
    A list ... pop a2 ... (4) push ... as follows: after a3 is
    popped from the stack, directly go to a14 (after
    popping a2). Here ... than a2.end."
  • Note
  • The paragraph above describes how algorithm
    Stack_Tree_Desc [Alkh02] would handle the case
    of Fig. 3(a). According to Stack_Tree_Desc
    [Alkh02], the corrected description is the
    accurate one.