1
Schema-based Scheduling of Event Processors and
Buffer Minimization for Queries on Structured
Data Streams
FluXQuery: An Optimizing XQuery Processor for
Streaming XML Data
Stefanie Scherzinger, joint work with Christoph
Koch, Nicole Schweikardt, and Bernhard Stegmaier
2
XML Streams
  • Very large XML documents.
  • Schema information provided with the data.
  • Main-memory based applications.

3
Queries on XML Streams
  • 1. Boolean or node-selecting queries (XPath):
    state-of-the-art techniques use little memory
  • 2. Transformations (XQuery, XSLT): excessive
    memory consumption

4
Classical XQuery Evaluation
Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book (title|author|price)*>
List title(s) and authors of books:
  <results>
  { for $b in /bib/book return
      <result> {$b/title} {$b/author} </result> }
  </results>
→ Must buffer titles and authors!
Example
Data stream:
  <book>
    <author>Kemper</author>
    <title>Datenbanksysteme</title>
    <author>Eickler</author>
    <price>40</price>
  </book>
Buffer:
  <author>Kemper</author>
  <title>Datenbanksysteme</title>
  <author>Eickler</author>
Output:
  <result>
    <title>Datenbanksysteme</title>
    <author>Kemper</author>
    <author>Eickler</author>
  </result>
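The buffering behaviour of the classical evaluation can be mimicked in a few lines of Python. This is only a sketch under assumptions: the stream is modelled as a list of (tag, text) pairs for the children of one book, and `classical_eval` is a hypothetical name, not part of any XQuery engine.

```python
def classical_eval(children):
    """Evaluate <result>{$b/title}{$b/author}</result> for one book.

    `children` holds the book's child elements in stream order, modelled
    as (tag, text) pairs (an assumption made for this sketch).
    """
    titles, authors = [], []
    for tag, text in children:
        # Nothing can be written yet: all titles must precede all authors
        # in the output, but the DTD allows them to arrive in any order.
        if tag == "title":
            titles.append(text)
        elif tag == "author":
            authors.append(text)
    buffered = len(titles) + len(authors)  # nodes held until </book>
    out = ["<result>"]
    out += [f"<title>{t}</title>" for t in titles]
    out += [f"<author>{a}</author>" for a in authors]
    out.append("</result>")
    return out, buffered


book = [("author", "Kemper"), ("title", "Datenbanksysteme"),
        ("author", "Eickler"), ("price", "40")]
output, buffered = classical_eval(book)
print(output, buffered)
```

For the example stream, all three title and author nodes sit in the buffer until the end of the book is reached.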
5
The FluXQuery-Approach (1)
Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book (title|author|price)*>
FluX query (for book node):
  <result>
  { process-stream $b
      on title as $t return {$t}
      on-first past(title,author) return
        { for $a in $b/author return {$a} } }
  </result>
List title(s) and authors of books:
  <results>
  { for $b in /bib/book return
      <result> {$b/title} {$b/author} </result> }
  </results>
  • Less buffering than in conventional evaluation

Example
Data stream:
  <book>
    <author>Kemper</author>
    <title>Datenbanksysteme</title>
    <author>Eickler</author>
    <price>40</price>
  </book>
Buffer:
  <author>Kemper</author>
  <author>Eickler</author>
Output:
  <result>
    <title>Datenbanksysteme</title>
    <author>Kemper</author>
    <author>Eickler</author>
  </result>
6
The FluXQuery-Approach (2)
Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book ((title|author)*,price)>
FluX query (for book node):
  <result>
  { process-stream $b
      on title as $t return {$t}
      on-first past(title,author) return
        { for $a in $b/author return {$a} } }
  </result>
List title(s) and authors of books:
  <results>
  { for $b in /bib/book return
      <result> {$b/title} {$b/author} </result> }
  </results>
  • Flush buffers early!

Example
Data stream:
  <book>
    <author>Kemper</author>
    <title>Datenbanksysteme</title>
    <author>Eickler</author>
    <price>40</price>
  </book>
Buffer:
  <author>Kemper</author>
  <author>Eickler</author>
Output:
  <result>
    <title>Datenbanksysteme</title>
    <author>Kemper</author>
    <author>Eickler</author>
  </result>
7
The FluXQuery-Approach (3)
Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book (title*,author*,price)>
FluX query:
  <result>
  { process-stream $b
      on title as $t return {$t}
      on author as $a return {$a} }
  </result>
List title(s) and authors of books:
  <results>
  { for $b in /bib/book return
      <result> {$b/title} {$b/author} </result> }
  </results>
→ No buffering!
Example
Data stream:
  <book>
    <title>Datenbanksysteme</title>
    <author>Kemper</author>
    <author>Eickler</author>
    <price>40</price>
  </book>
Buffer: (empty)
Output:
  <result>
    <title>Datenbanksysteme</title>
    <author>Kemper</author>
    <author>Eickler</author>
  </result>
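The three FluXQuery examples differ only in which children are streamed straight to the output and in when the author buffer may be flushed. The Python sketch below makes that explicit. It rests on assumptions: children of one book are modelled as (tag, text) pairs, the input conforms to the DTD, and `flux_eval`, `streamed` and `flush_on` are hypothetical names chosen for illustration, not part of FluXQuery.

```python
def flux_eval(children, streamed, flush_on):
    """Event-based evaluation of the title/author query for one book.

    streamed: tags handled by 'on a as $x return {$x}' (copied on arrival)
    flush_on: tags whose arrival shows, by the DTD's order constraints,
              that title and author are past
    """
    out = ["<result>"]
    buffer = []          # authors waiting for on-first past(title,author)
    peak = 0             # largest number of buffered nodes
    flushed_at = None    # stream position that triggered the flush
    for pos, (tag, text) in enumerate(children):
        if tag in streamed:
            out.append(f"<{tag}>{text}</{tag}>")
        elif tag == "author":
            buffer.append(text)
            peak = max(peak, len(buffer))
        elif tag in flush_on and flushed_at is None:
            out += [f"<author>{a}</author>" for a in buffer]
            buffer.clear()
            flushed_at = pos
    # At </book> every symbol is past, so anything still buffered is written.
    out += [f"<author>{a}</author>" for a in buffer]
    out.append("</result>")
    return out, peak, flushed_at


mixed = [("author", "Kemper"), ("title", "Datenbanksysteme"),
         ("author", "Eickler"), ("price", "40")]
ordered = [("title", "Datenbanksysteme"), ("author", "Kemper"),
           ("author", "Eickler"), ("price", "40")]

approach1 = flux_eval(mixed, {"title"}, set())              # (title|author|price)*
approach2 = flux_eval(mixed, {"title"}, {"price"})          # ((title|author)*,price)
approach3 = flux_eval(ordered, {"title", "author"}, set())  # titles before authors
```

All three calls produce the same output. Approaches 1 and 2 both buffer two authors, but approach 2 releases them as soon as the price element arrives, and approach 3 never buffers at all.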
8
What's next?
  • The XQuery Fragment
  • FluX Query Language
  • Translating XQuery into FluX
  • Experiments

9
XQuery-, an XQuery Fragment
  • Contains...
    • arbitrarily nested for-loops,
    • where-conditions,
    • if-statements,
    • joins
  • Does not contain...
    • * and // in paths
    • aggregation
    • let-constructs

10
Simple XQuery- Expressions
  • An XQuery- expression is simple if it can be
    executed without buffering the stream

Example 1
  <a> {$x} </a> {if $x/b = 5 then <b>5</b>}
  simple
Example 2
  {$x} {$x}
  not simple
11
FluX Query Language
  • FluX expressions
    • simple XQuery- expression
    • string { process-stream $y H } string
  • Event handlers H
    • on-first past(S) return α
      • α: XQuery- expression
      • S: set of symbols
    • on a as $x return Q
      • a: symbol name
      • $x: variable
      • Q: FluX expression

α is executed on buffers
Q is executed in an event-based fashion
12
Example
FluX query (for book node):
  <result>
  { process-stream $b
      on title as $t return {$t}
      on-first past(title,author) return
        { for $a in $b/author return {$a} } }
  </result>
13
Safe FluX Queries
  • A FluX query is safe if no XQuery- expression
    refers to elements that may still be encountered
    in the stream

Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book ((title|author)*, price)>
FluX query:
  <result>
  { process-stream $b
      on title as $t return {$t}
      on-first past(title,author) return
        { for $p in $b/price return {$p} } }
  </result>
Data stream:
  <book>
    <author>Kemper</author>
    <title>Datenbanksysteme</title>
    <author>Eickler</author>
    <price>40</price>
  </book>
Execute: not safe!
14
Safe FluX Queries
  • A FluX query is safe if no XQuery- expression
    refers to elements that may still be encountered
    in the stream

Bibliography DTD:
  <!ELEMENT bib (book)*>
  <!ELEMENT book ((title|author)*, price)>
FluX query:
  <result>
  { process-stream $b
      on title as $t return {$t}
      on-first past(title,author,price) return
        { for $p in $b/price return {$p} } }
  </result>
Data stream:
  <book>
    <author>Kemper</author>
    <title>Datenbanksysteme</title>
    <author>Eickler</author>
    <price>40</price>
  </book>
Execute: safe!
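The difference between the two safety examples can be checked mechanically. The Python sketch below is a deliberately simplified test, not the definition used by FluXQuery: it only asks whether every child symbol read by a handler body appears in the handler's past set, and it ignores that order constraints from the DTD may make further symbols implicitly past. The function names and the regular-expression treatment of paths are assumptions for illustration.

```python
import re


def referenced_symbols(expr, var):
    """Child element names that `expr` reads via paths of the form $var/name."""
    return set(re.findall(r"\$" + re.escape(var) + r"/(\w+)", expr))


def is_safe_handler(past, expr, var):
    """Return (safe, missing) for 'on-first past(past) return expr'."""
    missing = referenced_symbols(expr, var) - set(past)
    return not missing, missing


body = "for $p in $b/price return {$p}"
unsafe = is_safe_handler({"title", "author"}, body, "b")
safe = is_safe_handler({"title", "author", "price"}, body, "b")
print(unsafe, safe)
```

With past(title,author) the handler may fire before the price element has been seen, so `price` is reported as missing. Adding `price` to the past set removes the problem.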
15
XQuery to FluX
  • Normalize XQuery- Q into Q'
  • Rewrite normalized XQuery- Q' to FluX query F
    using order constraints from the DTD
    • F is safe w.r.t. the DTD
    • F is equivalent to Q
    • F has low memory consumption

16
Experiments
  • Based on XMark
  • Queries adapted to XQuery- fragment
  • Environment
    • AMD Athlon XP 2000, 512MB RAM
    • Linux, Sun JDK 1.4.2_03
  • Measurements
    • Execution time
    • Memory consumption

17
Experiments with XMark
18
Interim Summary
  • FluXQuery engine supports
    • powerful fragment of XQuery
      • arbitrarily nested for-loops
      • and joins
    • event-based query processing
    • conscious handling of main-memory buffers
    • algebraic optimization based on schema
      information

19
for $b in /book return
  {$b/publisher/name} {$b/publisher/address}
<!ELEMENT bib (book)*>
<!ELEMENT book (title,author,publisher)>
<!ELEMENT publisher (name, address)>
no buffering necessary!
20
for $b in /book return
  {$b/publisher/name} {$b/publisher/address}
normalize
for $b in /book return
  { for $p in $b/publisher return
      { for $n in $p/name return {$n} } }
  { for $p in $b/publisher return
      { for $a in $p/address return {$a} } }

loop twice over all publishers, ergo buffer!
21
Algebraic Optimization
  • In translation, exploit order constraints
  • Can also exploit cardinality constraints
    • e.g. merging for-loops

22
for $b in /book return
  {$b/publisher/name} {$b/publisher/address}
1. normalize
for $b in /book return
  { for $p in $b/publisher return
      { for $n in $p/name return {$n} } }
  { for $q in $b/publisher return
      { for $a in $q/address return {$a} } }
2. algebraic optimization
for $b in /book return
  { for $p in $b/publisher return
      { for $n in $p/name return {$n} }
      { for $a in $p/address return {$a} } }

no need to buffer
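Why merging the two publisher loops removes the buffer, and why it needs a cardinality constraint, can be seen in a small Python sketch. The publishers of one book are modelled as a one-shot iterator to stand in for a stream. The function names and the placeholder values N1, A1, N2, A2 are made up for illustration.

```python
def two_loops(publishers):
    """Normalized form: one loop for names, a second loop for addresses."""
    buffered = list(publishers)  # a stream can be read once, so materialize it
    out = [p["name"] for p in buffered]
    out += [p["address"] for p in buffered]
    return out, len(buffered)


def merged_loop(publishers):
    """Optimized form: a single pass that retains no publisher."""
    out = []
    for p in publishers:
        out += [p["name"], p["address"]]
    return out


one = [{"name": "N1", "address": "A1"}]
two = one + [{"name": "N2", "address": "A2"}]

same_for_one = two_loops(iter(one))[0] == merged_loop(iter(one))
same_for_two = two_loops(iter(two))[0] == merged_loop(iter(two))
print(same_for_one, same_for_two)
```

The two forms agree when a book has a single publisher, which is what the DTD on the slide guarantees. With two publishers the output order differs, so the rewrite is only valid under that cardinality constraint.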
23
Future Work
  • Enlarge the XQuery fragment
    • Add aggregation
    • Add *- and //-paths → allow recursive DTDs
  • Extend algebraic optimizations
  • Optimize backend of query engine

24
Related Work
  • Altinel, Franklin. Efficient Filtering of XML
    Documents for Selective Dissemination of
    Information. VLDB 2000
  • Buneman, Grohe, Koch. Path Queries on Compressed
    XML. VLDB 2003
  • Chan, Felber, Garofalakis, Rastogi. Efficient
    Filtering of XML Documents with XPath
    Expressions. ICDE 2002
  • Deutsch, Tannen. Reformulation of XML Queries
    and Constraints. ICDT 2003
  • Fegaras, Levine, Bose, Chaluvadi. Query
    Processing on Streamed XML Data. CIKM 2002
  • Green, Miklau, Onizuka, Suciu. Processing XML
    Streams with Deterministic Automata. ICDT 2003
  • Gupta, Suciu. Stream Processing of XPath Queries
    with Predicates. SIGMOD 2003
  • Ludäscher, Mukhopadhyay, Papakonstantinou. A
    Transducer-Based XML Query Processor. VLDB 2002
  • Marian, Siméon. Projecting XML Documents. VLDB
    2003
  • Olteanu, Kiesling, Bry. An Evaluation of Regular
    Path Expressions with Qualifiers against XML
    Streams. ICDE 2003

25
</thanks>