Containment of Partially Specified Tree-Pattern Queries

1
Containment of Partially Specified Tree-Pattern
Queries
Dimitri Theodoratos (NJIT, USA)
Theodore Dalamagas (NTUA, Greece)
Pawel Placek (NJIT, USA)
Stefanos Souldatos (NTUA, Greece)
Timos Sellis (NTUA, Greece)
2
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
3
Motivating Example
  • Tree structure (e.g. XML) with motorbike spare
    parts.
  • We search for spare parts.
  • BUT

4
Motivating Example
  • Dimitri Theodoratos lives in NJ.
  • He has a Yamaha Serrow motorbike in Greece.
  • He searches for spare parts in Greece or the USA.
  • → structural difference

5
Motivating Example
  • Theodore Dalamagas has a BMW motorbike.
  • He looks for spare parts worldwide.
  • → structural inconsistency

../F650GS/650cc
../650cc/F650GS
6
Motivating Example
  • Stefanos Souldatos has a Honda Varadero.
  • But he is not fully aware of the tree structure.
  • → unknown structure

7
Motivating Example
  • Pawel Placek wants to buy a motorbike that he can
    easily find spare parts for.
  • He searches in many different tree structures.
  • → source integration

8
Motivation
  • Querying tree-structured data
  • BUT
  • structure is not always strictly defined
  • user does not always deal with structure
  • e.g. "Find Honda spare parts in Greece."

9
Our Approach
  • Dimensions: semantically related nodes.
  • Dimension Graphs: summary of the tree structure.
  • Query Language: partial specification of the
    structure (Partially Specified Tree-Pattern
    Queries).
  • We study the problem of Query Containment for
    Partially Specified Tree-Pattern Queries.

10
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
11
Dimension Graph
Dimension graph: summary of the tree structure
DIMENSIONS
R: Root
C: Country
L: Location
B: Brand
T: Type
M: Model
E: Engine
12
Dimension Graph
  • offers a summary of the structure of the tree.
  • provides the necessary semantics for query
    formulation.
  • sets the framework for querying sources with
    structural differences and inconsistencies.
  • supports query evaluation and optimization.
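
As a rough illustration of how a dimension graph can summarize a tree, the sketch below derives one from a toy tree. It assumes (this is not stated on the slides) that the graph has an edge from dimension A to dimension B whenever some node of A has a child of B. The `TREE` data and the `dimension_graph` helper are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical toy tree: (node_id, value, dimension, parent_id).
# Dimensions follow the slides: R(oot), C(ountry), L(ocation), B(rand), M(odel), E(ngine).
TREE = [
    (0, "root", "R", None),
    (1, "Greece", "C", 0),
    (2, "BMW", "B", 1),
    (3, "F650GS", "M", 2),
    (4, "650cc", "E", 3),
    (5, "USA", "C", 0),
    (6, "NJ", "L", 5),
    (7, "BMW", "B", 6),
    (8, "650cc", "E", 7),
    (9, "F650GS", "M", 8),
]


def dimension_graph(tree):
    """Return {dimension: set of child dimensions} (assumed edge rule, see lead-in)."""
    dim_of = {node_id: dim for node_id, _value, dim, _parent in tree}
    graph = defaultdict(set)
    for _node_id, _value, dim, parent in tree:
        graph[dim]  # touch the key so every dimension appears in the result
        if parent is not None:
            graph[dim_of[parent]].add(dim)
    return dict(graph)


graph = dimension_graph(TREE)
for dim, children in graph.items():
    print(dim, "->", sorted(children))
```

In this toy tree the Greek branch stores ../F650GS/650cc and the US branch stores ../650cc/F650GS, so the summary contains both M -> E and E -> M, which is the kind of structural inconsistency the motivating examples describe.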

13
Partially Specified Tree-Pattern Query
  • Query: Find shops with spare parts for all models
    and all engines of BMW motorbikes in Greece
    (structural info).
[Figure: the query drawn over the dimensions R, C, L, B, T, M, E]
14
Partially Specified Tree-Pattern Query
  • Query: Find shops with spare parts for all models
    and all engines of BMW motorbikes in Greece
    (structural info).
[Figure: the query drawn over the dimensions R, C, L, B, T, M, E; labels: partially specified paths (PSP)]
15
Partially Specified Tree-Pattern Query
  • Query: Find shops with spare parts for all models
    and all engines of BMW motorbikes in Greece
    (structural info).
[Figure: the query drawn over the dimensions R, C, L, B, T, M, E; labels: partially specified paths (PSP), output path]
16
Partially Specified Tree-Pattern Query
  • Query: Find shops with spare parts for all models
    and all engines of BMW motorbikes in Greece
    (structural info).
[Figure: the query drawn over the dimensions R, C, L, B, T, M, E; labels: partially specified paths (PSP), output path; edge types: parent-child, ancestor-descendant]
17
Partially Specified Tree-Pattern Query
  • Query: Find shops with spare parts for all models
    and all engines of BMW motorbikes in Greece
    (structural info).
[Figure: the query drawn over the dimensions R, C, L, B, T, M, E; labels: partially specified paths (PSP), output path, node sharing expression (NSE); edge types: parent-child, ancestor-descendant]
18
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
19
Additional Concepts
Full Form Query
20
Additional Concepts
Full Form Query
Dimension Trees
[Figure: dimension trees and query graph]
21
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
22
Absolute Containment
Q1 ⊆ Q2 ⇔ each result of Q1 is a result of Q2.
23
Absolute Containment
Q1 ⊆ Q2 ⇔ each result of Q1 is a result of Q2.
Check: homomorphism from Q2 to Q1
24
Absolute Containment
Q1 ⊆ Q2 ⇔ each result of Q1 is a result of Q2.
Check: homomorphism from Q2 to Q1
[Figure: queries Q1 and Q2 with PSPs p1, p2, p3 and p4]
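
The homomorphism test can be sketched on a deliberately simplified query model. The code below is only an illustration under stated assumptions: a query is a dict of named nodes labeled with dimensions plus "child" and "desc" edges, and node sharing expressions, values and output nodes are ignored. The function names and the brute-force search are choices made for this example, not the authors' algorithm.

```python
from itertools import product


def reachable(edges):
    """Transitive closure of a query's edges as a set of (u, v) pairs."""
    closure = {(u, v) for u, v, _kind in edges}
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def has_homomorphism(q2, q1):
    """True if a label-preserving map from q2's nodes to q1's nodes keeps every edge of q2.

    A "child" edge must land on a "child" edge of q1; a "desc" edge must land
    on a pair connected by one or more edges in q1.
    """
    child1 = {(u, v) for u, v, kind in q1["edges"] if kind == "child"}
    desc1 = reachable(q1["edges"])
    names2 = list(q2["nodes"])
    candidates = [
        [n1 for n1, label in q1["nodes"].items() if label == q2["nodes"][n2]]
        for n2 in names2
    ]
    for images in product(*candidates):
        h = dict(zip(names2, images))
        if all(
            (h[u], h[v]) in (child1 if kind == "child" else desc1)
            for u, v, kind in q2["edges"]
        ):
            return True
    return False


# Hypothetical queries: q1 is R/C/B/M (fully specified), q2 is R//B//M (partially specified).
q1 = {
    "nodes": {"r": "R", "c": "C", "b": "B", "m": "M"},
    "edges": [("r", "c", "child"), ("c", "b", "child"), ("b", "m", "child")],
}
q2 = {
    "nodes": {"r": "R", "b": "B", "m": "M"},
    "edges": [("r", "b", "desc"), ("b", "m", "desc")],
}

print(has_homomorphism(q2, q1))  # homomorphism from q2 to q1, so q1 is contained in q2
print(has_homomorphism(q1, q2))  # q2 has no C node, so there is no map in this direction
```

The search enumerates every candidate mapping, which is fine for queries with a handful of nodes but grows exponentially with query size.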
25
Relative Containment (w.r.t. G)
Q1 ⊆_G Q2 ⇔ each result of Q1 in G is a result of Q2 in G.
26
Relative Containment (w.r.t. G)
Q1 ⊆_G Q2 ⇔ each result of Q1 in G is a result of Q2 in G.
Check: homomorphism from the Dimension Trees of Q2 to
the Dimension Trees of Q1
27
Relative Containment (w.r.t. G)
Q1 ⊆_G Q2 ⇔ each result of Q1 in G is a result of Q2 in G.
Check: homomorphism from the Dimension Trees of Q2 to
the Dimension Trees of Q1
[Figure: a dimension tree of Q1 and a dimension tree of Q2]
28
Relative Containment Heuristic
  • Absolute Containment (AC): 1 msec
  • Relative Containment (RC): 100 msec
29
Relative Containment Heuristic
  • Absolute Containment (AC): 1 msec
  • Relative Containment (RC): 100 msec
  • Relative Containment Heuristic (RCH)
    • → sound but not complete
    • extract structural information from the Dimension
      Graph
    • insert it in the query Q1
    • check Q1 ⊆ Q2 instead of Q1 ⊆_G Q2
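
The "extract and insert" step can be sketched as follows. The slides do not say which structural information is extracted, so this example assumes one simple rule: when a dimension has exactly one incoming edge in the dimension graph, every node of that dimension must have a parent of that single dimension, and the parent can be added to Q1. The query model (named nodes plus "child"/"desc" edges), the `augment_with_graph` helper and the generated node names are all hypothetical.

```python
def augment_with_graph(query, graph):
    """Return a copy of `query` extended with parents forced by the dimension graph.

    Assumed rule (not from the slides): a dimension with exactly one parent
    dimension in `graph` forces a parent-child edge above each of its nodes.
    Generated node names start with "_" and are assumed not to clash with
    existing names.
    """
    # parents[d] = dimensions with an edge to d in the dimension graph
    parents = {}
    for dim, children in graph.items():
        for child in children:
            parents.setdefault(child, set()).add(dim)

    nodes = dict(query["nodes"])
    edges = list(query["edges"])
    has_parent = {v for _u, v, kind in edges if kind == "child"}

    for start in list(query["nodes"]):
        name, seen = start, {nodes[start]}
        while name not in has_parent:
            options = parents.get(nodes[name], set())
            if len(options) != 1:
                break  # no parent (root) or ambiguous parent: nothing is forced
            (parent_dim,) = options
            if parent_dim in seen:
                break  # guard against cycles in the dimension graph
            seen.add(parent_dim)
            new_name = f"_{parent_dim}{len(nodes)}"
            nodes[new_name] = parent_dim
            edges.append((new_name, name, "child"))
            has_parent.add(name)
            name = new_name
    return {"nodes": nodes, "edges": edges}


# Hypothetical dimension graph: T can appear under B or M, everything else has one parent.
graph = {"R": {"C"}, "C": {"B"}, "B": {"T", "M"}, "M": {"T"}, "T": set()}
q1 = {"nodes": {"b": "B", "t": "T"}, "edges": [("b", "t", "desc")]}

augmented = augment_with_graph(q1, graph)
print(augmented["nodes"])
print(augmented["edges"])
```

The augmented query can then be passed to any absolute containment test in place of Q1. Because only constraints implied by the graph are added, a positive answer remains valid relative to the graph, while some true relative containments may still be missed, which matches the "sound but not complete" remark above.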

30
Relative Containment Heuristic
  • Example

[Figure: example queries Q1 and Q2 (PSPs p1, p2; nodes C, B, B, T), compared for absolute containment]
31
Relative Containment Heuristic
  • Example

[Figure: example queries Q1 and Q2 (PSPs p1, p2; nodes C, B, B, T), compared for absolute containment; annotation: BT R-C, CB]
32
Relative Containment Heuristic
  • Example

[Figure: example queries Q1 and Q2 (PSPs p1, p2; nodes R, C, C, B, B, T), compared for absolute containment and for relative containment w.r.t. G; annotation: BT R-C, CB]
33
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
34
Experiments
  • We measured
    • execution time for
      • Absolute Containment (AC)
      • Relative Containment (RC)
      • Relative Containment Heuristic (RCH)
    • accuracy for RCH
  • for various graph sizes
  • for various query sizes

35
Time
[Figure: execution time (msec) of AC, RC and RCH. Upper panels: graph dimensions 20, 30 and 40, with x-axis ranges "Graph paths" 10 - 80, 15 - 120 and 20 - 160. Lower panels: query PSPs 1 and 2, with x-axis "Nodes per PSP" 3 - 6.]
36
Accuracy of RCH
  • 80% for graphs of common sizes
    • based on XML benchmarks (XMach, XMark, etc.)
  • 50% for graphs of higher density

37
Outline: Introduction, Data Model, Additional Concepts, Query Containment, Experiments, Conclusion
38
Conclusion
  • Query Containment for Partially Specified
    Tree-Pattern Queries (PSTPQs).
  • Sound technique for checking Relative Query
    Containment
    • Time: one order of magnitude
    • Accuracy: over 80%

39
Future Work
  • Heuristics for checking Relative Containment
    • precomputed and on-the-fly
    • trade-off between time and accuracy
  • Special forms of queries, e.g. swings

40
Questions?
41
Links
  • Introduction (2-9)
  • Data Model (10-17)
  • Additional Concepts (18-20)
  • Query Containment (21-32)
  • Experiments (33-36)
  • Conclusion (37-41)
  • Appendix (42-46)

42
Appendix
43
Who defines the dimensions?
  • Automatic
    • XML tags (dimension graph = path summary, path
      index, structural summary)
  • Semi-automatic
    • Graph administrator + XML tags
      (dimension = group of XML tags)
    • Graph administrator + ontology
  • Manual
    • Graph administrator

44
Inference Rules
(IR1) ⊢ Rp1 ? Rp2
(IR2) Ap1 ? Ap2, Ap2 ? Ap3 ⊢ Ap1 ? Ap3
(IR3) a structural expression that involves Ap ⊢ Rp Ap
(IR4) Ap ? Bp ⊢ Ap Bp
(IR5) Ap Bp, Bp Cp ⊢ Ap Cp
(IR6) Ap ? Bp, Ap Cp ⊢ Bp Cp
(IR7) Ap ? Bp, Cp Bp ⊢ Cp Ap
(IR8) Ap1 ? Bp1, Bp1 ? Bp2 ⊢ Ap2 ? Bp2
(IR9) Ap1 Bp1, Bp1 ? Bp2 ⊢ Ap2 Bp2
(IR10) Ap1 Bp1, Ap1 ? Ap2, Rp2 Bp2 ⊢ Ap2 Bp2
(IR11) Ap1 Bp1, Bp1 ? Bp2 ⊢ Ap1 ? Ap2
(IR12) Ap1 ? Bp1, Cp2 ? Bp2, Dp1 ? Dp2 ⊢ Dp1 Ap1
(IR13) Ap1 ? Bp1, Ap2 ? Cp2, Dp1 ? Dp2 ⊢ Dp1 Ap1
(IR14) Ap1 Bp1, Bp2 Ap2, Cp1 ? Cp2 ⊢ Cp1 Ap1
1. Full Form Query
45
Dimension Trees
r/Greece/BMW/ TE/M
r/Greece/BMW/ T/M E
r/Greece/BMW/ T/E/M
r/Greece/BMW/ TM/E/EM
46
Previous Approaches
  • Keyword-based search approach
    • Absence of structure
  • Naive approach
    • All possible query patterns are generated
      (Honda/Greece, Greece/Honda)
  • Approximation techniques
    • Relax the query → more answers
  • Traditional integration approach
    • Global structure and mapping rules
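
To make the cost of the naive approach concrete, the sketch below enumerates every ordering of a set of query labels as a candidate path pattern. The "/" separator and the `all_path_patterns` helper are assumptions made for illustration.

```python
from itertools import permutations


def all_path_patterns(labels):
    """Generate one path pattern per ordering of the given labels."""
    return ["/".join(order) for order in permutations(labels)]


patterns = all_path_patterns(["Honda", "Greece"])
print(patterns)  # ['Honda/Greece', 'Greece/Honda']

# The number of patterns grows factorially with the number of labels.
print(len(all_path_patterns(["Honda", "Greece", "Varadero", "1000cc"])))  # 24
```

With n labels there are n! orderings, which is why generating all possible patterns does not scale beyond very small queries.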