Converting Disjunctive Data to Disjunctive Graphs

Provided by: larso
Learn more at: https://www.deg.byu.edu

Transcript and Presenter's Notes

1
Converting Disjunctive Data to Disjunctive Graphs
  • Lars Olson
  • Data Extraction Group
  • Funded by NSF

2
Introduction
  • Disjunctive databases
  • Needed to represent disjunctive data
  • Queries are CoNP-complete in general [Imielinski
    and Vadaparty, 1989]
  • Transitive closure in disjunctive graphs
  • CoNP-complete in general
  • Polynomial time, under certain circumstances
    [Lobo et al., 1995]

3
The Problem
  • How do we convert the data into a disjunctive
    graph?
  • What is the complexity of the conversion?
  • Time
  • Space / Memory

4
Implementation
  • XML data repository
  • Shore / Niagara (Univ. of Wisconsin)
  • Xerces XML parser (Apache.org)
  • How do we represent a disjunctive database in
    storage?
  • Needs to be easy to convert to disjunctive graph
  • Needs to minimize the changes to the DTD and
    thus, the existing data

5
XML → Graph Conversion
  • XML → DOM tree

    <doc>
      <Node name="A">
        <EdgeTo ref="B"/>
      </Node>
      <Node name="B"></Node>
      ...
    </doc>

    [Figure: DOM tree of doc, Node, and EdgeTo elements, labeled A, B, B]

  • Use primary key to distinguish doc→Node edges
  • Use foreign key to perform join (EdgeTo.ref =
    Node.name)
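
The two bullets above can be sketched in a few lines. This is only an illustration: it uses Python's standard xml.etree.ElementTree rather than the Xerces parser named in the talk, and the function name xml_to_graph and the dangling-reference check are assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

XML = """
<doc>
  <Node name="A"><EdgeTo ref="B"/></Node>
  <Node name="B"></Node>
</doc>
"""

def xml_to_graph(text):
    """Hypothetical helper: turn <Node>/<EdgeTo> records into a directed graph."""
    root = ET.fromstring(text)
    # Primary key: each <Node> record is identified by its name attribute.
    nodes = {node.get("name") for node in root.iter("Node")}
    edges = set()
    for node in root.iter("Node"):
        for edge_to in node.findall("EdgeTo"):
            ref = edge_to.get("ref")
            # Foreign-key join: EdgeTo.ref must equal some Node.name.
            if ref not in nodes:
                raise ValueError(f"dangling reference: {ref}")
            edges.add((node.get("name"), ref))
    return nodes, edges

nodes, edges = xml_to_graph(XML)
print(sorted(nodes), sorted(edges))
```

Each element is visited a constant number of times, which matches the linear bound claimed later in the talk.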

6
Disjunctions in XML, 1st Case
<Node name="A">
  <EdgeTo ref="B"/>
  <Disj>
    <EdgeTo ref="C"/>
    <EdgeTo ref="D"/>
  </Disj>
</Node>
...

[Figure: graph with nodes A, B, C, and D]
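
One possible reading of this first case, sketched in Python: an <EdgeTo> directly under a <Node> is an ordinary edge, while the <EdgeTo> children of a nested <Disj> become a single disjunctive edge with one tail and several candidate heads. The function name convert_case1 and the (tail, set-of-heads) representation are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML = """
<doc>
  <Node name="A">
    <EdgeTo ref="B"/>
    <Disj><EdgeTo ref="C"/><EdgeTo ref="D"/></Disj>
  </Node>
</doc>
"""

def convert_case1(text):
    """Hypothetical helper: simple tail, compound (disjunctive) head."""
    root = ET.fromstring(text)
    definite = set()     # ordinary edges as (tail, head)
    disjunctive = []     # one entry per <Disj>: (tail, candidate heads)
    for node in root.iter("Node"):
        tail = node.get("name")
        for child in node:
            if child.tag == "EdgeTo":
                definite.add((tail, child.get("ref")))
            elif child.tag == "Disj":
                # Only one disjunctive edge is stored for all children.
                heads = frozenset(e.get("ref") for e in child.findall("EdgeTo"))
                disjunctive.append((tail, heads))
    return definite, disjunctive

definite, disjunctive = convert_case1(XML)
print(definite, disjunctive)
```

Here A has a definite edge to B and one disjunctive edge whose head is C or D.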
But how do we represent a disjunctive tail?

7
Disjunctions in XML, 1st Case
<Node name="A">
  <EdgeTo ref="B"/>
  <Disj>
    <EdgeTo ref="C"/>
    <EdgeTo ref="D"/>
  </Disj>
</Node>
<Disj>
  <Node name="E">
    <EdgeTo ref="G"/>
    <EdgeTo ref="H"/>
  </Node>
  <Node name="F">
    <EdgeTo ref="G"/>
    <EdgeTo ref="H"/>
  </Node>
</Disj>
...

[Figure: alternatives joined by "or"]

8
Disjunctions in XML, 2nd Case
<Disj>
  <Tail>
    <Node name="E"/>
    <Node name="F"/>
  </Tail>
  <Head>
    <EdgeTo ref="G"/>
    <EdgeTo ref="H"/>
  </Head>
</Disj>
...

[Figure: graph with nodes E, F, G, and H]
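
A minimal sketch of the second case, assuming a disjunctive edge is stored as a pair (set of tails, set of heads) and that its meaning is "at least one tail-to-head edge from the full cross-product holds". The name convert_case2 and this representation are illustrative assumptions, not the talk's implementation.

```python
import xml.etree.ElementTree as ET
from itertools import product

XML = """
<doc>
  <Disj>
    <Tail><Node name="E"/><Node name="F"/></Tail>
    <Head><EdgeTo ref="G"/><EdgeTo ref="H"/></Head>
  </Disj>
</doc>
"""

def convert_case2(text):
    """Hypothetical helper: one (tails, heads) pair per <Disj> element."""
    root = ET.fromstring(text)
    result = []
    for disj in root.iter("Disj"):
        tails = frozenset(n.get("name") for n in disj.findall("Tail/Node"))
        heads = frozenset(e.get("ref") for e in disj.findall("Head/EdgeTo"))
        result.append((tails, heads))
    return result

disj_edges = convert_case2(XML)
tails, heads = disj_edges[0]
# The candidate edges are implied by the cross-product, not stored one by one.
candidates = set(product(tails, heads))
print(sorted(candidates))
```

Storing the two sets rather than the expanded pairs keeps the representation proportional to the number of XML elements.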
What if the disjunction isn't the full
cross-product?

9
Disjunctions in XML, 3rd Case
<Disj>
  <Tail>
    <Node name="I"/>
  </Tail>
  <Head>
    <EdgeTo ref="K"/>
  </Head>
  <Tail>
    <Node name="J"/>
  </Tail>
  <Head>
    <EdgeTo ref="K"/>
    <EdgeTo ref="L"/>
  </Head>
</Disj>
...
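
One way to read the third case, sketched in Python: each <Tail> is paired with the <Head> that follows it, and only the pairs within a Tail/Head group are candidate edges. Collecting all candidates of a <Disj> into one set is an assumption of this sketch, as is the name convert_case3; the slides do not spell out the exact in-memory form.

```python
import xml.etree.ElementTree as ET
from itertools import product

XML = """
<doc>
  <Disj>
    <Tail><Node name="I"/></Tail>
    <Head><EdgeTo ref="K"/></Head>
    <Tail><Node name="J"/></Tail>
    <Head><EdgeTo ref="K"/><EdgeTo ref="L"/></Head>
  </Disj>
</doc>
"""

def convert_case3(text):
    """Hypothetical helper: partial cross-product from Tail/Head pairs."""
    root = ET.fromstring(text)
    result = []
    for disj in root.iter("Disj"):
        candidates = set()
        tails = None
        for child in disj:                    # children in document order
            if child.tag == "Tail":
                tails = [n.get("name") for n in child.findall("Node")]
            elif child.tag == "Head" and tails is not None:
                heads = [e.get("ref") for e in child.findall("EdgeTo")]
                # Cross-product only within this Tail/Head pair.
                candidates.update(product(tails, heads))
                tails = None
        result.append(frozenset(candidates))
    return result

partial = convert_case3(XML)
print(sorted(partial[0]))
```

The pair (I, L) is absent from the result, which is what distinguishes this case from the full cross-product.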

10
Time and Space Complexity
  • n = # of nodes in DOM tree
  • counts edges as well
  • not necessarily proportional to # of values in
    the database
  • Ordinary XML: traverse tree, add edges.
    Distinguish records with primary keys, add edges
    for foreign keys. O(n) time, O(n) space.

11
Time and Space Complexity
  • <Disj>: same, except only one edge to all
    children. O(n) time, O(n) space.
  • <Disj> with <Tail> and <Head>: traverse tree,
    add <Tail> and <Head> elements to a list, add one
    edge, repeat for each Tail/Head pair. O(n) time,
    O(n) space.
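
A small experiment suggests why adding one edge per <Disj> matters for the O(n) bound. The sketch below builds a synthetic <Disj> with k tails and k heads (the helper names build_disj and measure are hypothetical) and compares the size of the compact tail-list/head-list form with the number of pairs an explicit cross-product would store.

```python
import xml.etree.ElementTree as ET

def build_disj(k):
    """Build a synthetic <Disj> with k tail nodes and k head references."""
    disj = ET.Element("Disj")
    tail = ET.SubElement(disj, "Tail")
    head = ET.SubElement(disj, "Head")
    for i in range(k):
        ET.SubElement(tail, "Node", name=f"T{i}")
        ET.SubElement(head, "EdgeTo", ref=f"H{i}")
    return disj

def measure(k):
    disj = build_disj(k)
    n = sum(1 for _ in disj.iter())          # DOM elements: 3 + 2k
    tails = [e.get("name") for e in disj.iter("Node")]
    heads = [e.get("ref") for e in disj.iter("EdgeTo")]
    compact = len(tails) + len(heads)        # one edge holding two lists
    expanded = len(tails) * len(heads)       # explicit cross-product
    return n, compact, expanded

for k in (10, 100):
    print(k, measure(k))
```

The compact form never exceeds n, while the expanded form grows quadratically in k, so expanding the cross-product would break the linear space bound.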

12
Summary
  • We need to introduce new XML constructs
  • <Disj>
  • Helper constructs <Tail> and <Head>
  • Three cases
  • simple tail, compound head
  • full cross-product
  • partial cross-product
  • Time and space requirements consistent with the
    transitive closure algorithm

13
Future Work
  • Solving path queries
  • Adding XML constructs for more complicated
    disjunctions
  • e.g. Tail (A or B), Head ((C and D) or E)
  • Determining frequency of disjunctive data in
    real-world data
  • Developing a normal form for disjunctive XML
  • Minimize redundancy
  • Minimize disjunctive tails