1
DBPL 2003: Attribute Grammars for Scalable Query Processing on XML Streams
  • Christoph Koch, University of Edinburgh
  • Stefanie Scherzinger, Universität Passau

2
Querying XML
  • XPath: //book[year=2003]/title
  • Node-selecting/boolean queries, no data transformations
  • Buffers necessary
  • XML Query:
        <books>
        { for $x in input()//book
          where $x/year = 2003
          return
            <book>
              { $x/title }
              <authors>
                { $x/author }
              </authors>
            </book> }
        </books>
  • Buffers necessary

3
Requirements for Scalable Query Processing on XML Streams
  • Evaluation in linear time in size of the input
  • One linear forward scan of the data
  • Bounded memory consumption: independent of the length of the stream, depending on the depth of the document

4
XML-DPDT
  • DPDA with output
  • Rejects malformed documents
  • Restricted stack discipline
  • Push on seeing opening tag <t>
  • Pop on seeing closing tag </t>
  • ⇒ Size of the stack is bounded by the maximum depth of the incoming document tree
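The restricted stack discipline can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the XML-DPDT construction from the talk: the tokenizer `TAG` and the function `scan` are hypothetical names, tags carry no attributes, and character data is ignored.

```python
import re

# Hypothetical tokenizer: recognizes <t> and </t> only (no attributes).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)>")

def scan(stream: str) -> int:
    """Return the maximum stack height, or raise ValueError if malformed."""
    stack = []
    max_depth = 0
    for closing, name in TAG.findall(stream):
        if not closing:
            stack.append(name)                 # push on opening tag <t>
            max_depth = max(max_depth, len(stack))
        elif stack and stack[-1] == name:
            stack.pop()                        # pop on matching closing tag </t>
        else:
            raise ValueError(f"unexpected </{name}>")
    if stack:
        raise ValueError(f"unclosed <{stack[-1]}>")
    return max_depth
```

The stack never grows beyond the nesting depth of the document, regardless of how many siblings stream by.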
5
Ease of Use?
  • DPDA T = (Q, Σ, Γ, δ, q0, Z0)
  • δ(q0, <bib>, Z0) = (bib0, (q0,bib))
  • δ(bib0, <book>, X) = (book0, (bib1,book))
  • δ(book0, <year>, X) = (year0, (book1,year))
  • δ(year0, </year>, (book1,year)) = (book1, ε)
  • δ(book1, <title>, X) = (title0, (book2,title))
  • δ(title0, </title>, (book2,title)) = (book2, ε)
  • δ(book2, <author>, X) = (author0, (book3,author))
  • δ(book3, <author>, X) = (author0, (book4,author))
  • δ(book4, <author>, X) = (author0, (book4,author))
  • δ(book4, </book>, (bib1,book)) = (bib1, ε)
  • δ(book3, </book>, (bib1,book)) = (bib1, ε)
  • δ(author0, </author>, (book3,author)) = (book3, ε)
  • δ(author0, </author>, (book4,author)) = (book4, ε)
  • ...

6
Our Aim
  • Query formalism which
  • meets the requirements for scalable stream processing, i.e. has the expressive power of XML-DPDTs,
  • is natural and easy to use,
  • does not allow specification of queries that cannot be evaluated scalably.
  • Our solution: XSAGs

7
XML Stream Attribute Grammars (XSAGs)
  • Query language for XML streams
  • Data transformations
  • Scalable evaluation

8
Extended Regular Tree Grammars
Grammar G = (Nt, T, P, bib)
Nonterminals Nt = {bib, book, year, title, author}
Terminals T = {bib, book, year, title, author, PCDATA}
Productions P:
    bib ::= bib( book* )
    book ::= book( year.title.author.author* )
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= author( PCDATA )
[Figure: document tree] ∈ L(G)
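Membership in L(G) can be illustrated with a small non-streaming check. The sketch below assumes the productions read bib( book* ) and book( year.title.author.author* ), i.e. that the repetition stars lost in this transcript belong there; `PRODUCTIONS` and `in_language` are hypothetical names, and content models are encoded as Python regular expressions over child tag names.

```python
import re
import xml.etree.ElementTree as ET

# Content models as regular expressions over child tag names, each name
# followed by one space. The stars are an assumption about the slide.
PRODUCTIONS = {
    "bib": r"(book )*",
    "book": r"year title author (author )*",
    "year": None,      # None stands for PCDATA
    "title": None,
    "author": None,
}

def in_language(node: ET.Element, start: str = "bib") -> bool:
    """Check whether the tree rooted at `node` is derivable from `start`."""
    if node.tag != start or node.tag not in PRODUCTIONS:
        return False
    model = PRODUCTIONS[node.tag]
    if model is None:
        return len(node) == 0              # PCDATA: no child elements allowed
    children = "".join(child.tag + " " for child in node)
    if re.fullmatch(model, children) is None:
        return False
    return all(in_language(child, child.tag) for child in node)
```

For example, `in_language(ET.fromstring("<bib/>"))` holds, while a book without any author is rejected.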
9
Basic XSAGs (bXSAGs)
  • Basic XSAG: based on a TDLL(1) grammar
  • Attribution functions
  • n ::= t(ρ)
  • n ::= fI t(ρ)
  • n ::= t(ρ) fII
  • n ::= fI t(ρ) fII
  • Regular expression τ(ρ) is one-unambiguous
  • book1 ::= book( ... )
  • book2 ::= book( ... )
  • τ( (book1 | book2)* ) = (book | book)*  ✗
  • τ( book1*.book2 ) = book*.book  ✗
  • τ( book1.book2* ) = book.book*  ✓
  • can be parsed with a lookahead of one symbol
  • all DTDs are TDLL(1)!
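One-unambiguity of a regular expression can be tested with the position-based (Glushkov) characterization of Brüggemann-Klein and Wood: the expression is one-unambiguous if no two distinct positions carrying the same symbol compete in the first set or in any follow set. The sketch below is an illustration, not code from the talk; the tuple encoding of expressions and the names `analyse` and `one_unambiguous` are assumptions made here.

```python
from itertools import count

def analyse(expr):
    """Glushkov-style analysis: return (first, follow, symbol_of)."""
    ids = count()
    symbol_of, follow = {}, {}

    def walk(e):
        kind = e[0]
        if kind == "sym":                       # ("sym", name)
            p = next(ids)
            symbol_of[p], follow[p] = e[1], set()
            return False, {p}, {p}
        if kind == "star":                      # ("star", e1)
            _, first, last = walk(e[1])
            for p in last:
                follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(e[1])
        n2, f2, l2 = walk(e[2])
        if kind == "alt":                       # ("alt", e1, e2)
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                            # ("cat", e1, e2)
            follow[p] |= f2
        return (n1 and n2,
                (f1 | f2) if n1 else f1,
                (l1 | l2) if n2 else l2)

    _, first, _ = walk(expr)
    return first, follow, symbol_of

def one_unambiguous(expr) -> bool:
    """True if one symbol of lookahead always determines the next position."""
    first, follow, symbol_of = analyse(expr)
    for positions in [first, *follow.values()]:
        symbols = [symbol_of[p] for p in positions]
        if len(symbols) != len(set(symbols)):
            return False
    return True
```

With `book = ("sym", "book")`, the expression `("cat", book, ("star", book))` passes the test, whereas `("cat", ("star", book), book)` and `("star", ("alt", book, book))` do not.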

10
Example: Rename Root Node
    bib ::= print <books>  bib( book* )  print </books>
    book ::= ECHO  book( year.title.author.author* )
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= author( PCDATA )
[Figure labels: print <books>, print </books>, ECHO]
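The effect of this query can be imitated with Python's standard SAX interface. This is a sketch of the behaviour, not of the XSAG evaluator: the class `RenameRoot` and the helper `rename_root` are hypothetical, attributes are ignored, and a depth counter stands in for the stack.

```python
import xml.sax
from xml.sax.saxutils import escape

class RenameRoot(xml.sax.ContentHandler):
    """Echo the stream, printing <books> ... </books> in place of the root tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # current nesting depth, i.e. the stack height
        self.out = []

    def startElement(self, name, attrs):
        self.depth += 1
        # root node: print <books>; every other node: ECHO
        self.out.append("<books>" if self.depth == 1 else f"<{name}>")

    def characters(self, content):
        self.out.append(escape(content))

    def endElement(self, name):
        self.out.append("</books>" if self.depth == 1 else f"</{name}>")
        self.depth -= 1

def rename_root(document: str) -> str:
    handler = RenameRoot()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.out)
```

The handler emits output as events arrive and keeps no part of the document in memory beyond the depth counter.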

11
Propagation of Attributions
  • [Figure: XSAG attribute and stack]

12
Propagation of Attributions
  • out1 = fI(in1)
  • Stack: bib

13
Propagation of Attributions
  • out1 = fI(in1)
  • Stack (top to bottom): book, bib

14
Propagation of Attributions

15
Propagation of Attributions
  • out2 = fII(in1, in2)
  • Stack (top to bottom): book, bib

16
Propagation of Attributions
  • out2 = fII(in1, in2)
  • Stack: bib

17
Propagation of Attributions
  • [Figure: result sent to output; stack]

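One possible reading of these animation steps, written as a Python sketch: on an opening tag the current attribute value in1 is saved on the stack and replaced by out1 = fI(in1); on the matching closing tag the saved in1 is popped and combined with the value in2 produced by the children, giving out2 = fII(in1, in2). The function `evaluate`, the event encoding, and the extra `tag` argument passed to the attribution functions are assumptions made for illustration.

```python
def evaluate(events, f1, f2, initial):
    """Propagate one attribute value through a stream of open/close events."""
    stack = []
    attr = initial
    for kind, tag in events:
        if kind == "open":
            stack.append(attr)            # remember in1 for the closing tag
            attr = f1(tag, attr)          # out1 = fI(in1)
        else:
            in1 = stack.pop()
            attr = f2(tag, in1, attr)     # out2 = fII(in1, in2)
    return attr                           # result, sent to the output

# Hypothetical example stream: <bib><book></book><book></book></bib>
EVENTS = [("open", "bib"), ("open", "book"), ("close", "book"),
          ("open", "book"), ("close", "book"), ("close", "bib")]

# Count elements: fI increments, fII passes on the value from the children.
node_count = evaluate(EVENTS, lambda t, a: a + 1, lambda t, in1, in2: in2, 0)
```

Only one saved value per open element is kept, so the memory used is proportional to the depth of the document.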
18
Grouping Sibling Nodes
  • First <author> seen? Print <authors>
  • No <author> anymore? Print </authors>
19
bXSAG: Grouping Sibling Nodes
    bib ::= print <books>  bib( book* )  print </books>
    book ::= ECHO; out1.flag := off  book( year.title.author.author* )  print </authors>
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= if ( in1.flag = off ) then begin out1.flag := on; print <authors> end  author( PCDATA )
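The flag technique can be mimicked with a SAX handler that keeps one boolean per open element. This is a sketch rather than the bXSAG evaluator: `GroupAuthors` and `group_authors` are hypothetical names, the root is not renamed, and, unlike the grammar above, </authors> is printed only if an author was actually seen.

```python
import xml.sax
from xml.sax.saxutils import escape

class GroupAuthors(xml.sax.ContentHandler):
    """Echo the stream and wrap each run of <author> siblings in <authors>."""

    def __init__(self):
        super().__init__()
        self.flags = []     # stack: one "author seen" flag per open element
        self.out = []

    def startElement(self, name, attrs):
        if name == "author" and self.flags and not self.flags[-1]:
            self.flags[-1] = True              # flag := on in the parent
            self.out.append("<authors>")       # first <author> seen
        self.flags.append(False)               # flag := off for the new element
        self.out.append(f"<{name}>")

    def characters(self, content):
        self.out.append(escape(content))

    def endElement(self, name):
        if self.flags.pop():
            self.out.append("</authors>")      # no <author> anymore
        self.out.append(f"</{name}>")

def group_authors(document: str) -> str:
    handler = GroupAuthors()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.out)
```

Because authors are the last children of a book in the grammar, closing the group when the parent closes produces a single contiguous <authors> element.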
20
yXSAG: Grouping Sibling Nodes
Attribution functions within the regular expression!
    bib ::= print <books>  bib( book* )  print </books>
    book ::= ECHO  book( year.title.( print <authors>  (author.author*)  print </authors> ) )
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= author( PCDATA )

21
Parse Tree for yXSAGs
[Figure: parse tree annotated with print <books>, print </books>, ECHO, print <authors>, print </authors>]
22
Easy XSAG Grammars
  • Easy XSAG: based on an STDLL(1) grammar
  • Attribution functions
  • n ::= t(ρ)
  • n ::= fI t(ρ)
  • n ::= t(ρ) fII
  • n ::= fI t(ρ) fII
  • attributed regular expression ρ
  • τ(ρ) is strongly one-unambiguous
  • editor* | author*  ✗
  • editor.editor* | author*  ✓
  • can be parsed with a lookahead of one symbol
  • only one way to derive the empty word ε

23
Conditional Output (yXSAG): Boolean Function MATCH_CHILDREN
    bib ::= print <books>  bib( book* )  print </books>
    book ::= book( ( MATCH_CHILDREN(2003,c) year ).
                   ( if ( in1.c = true ) then begin print <book>; ECHO end
                     (title.author.author*) ) )
             if ( in2.c = true ) then print </book>
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= author( PCDATA )
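A streaming imitation of this conditional output in Python SAX follows. It relies on year being the first child of book, so the decision is available before title and authors arrive and nothing has to be buffered. The class `SelectByYear`, the helper `select_by_year`, and the reading of MATCH_CHILDREN(2003, c) as "the text of year equals 2003" are assumptions made for this sketch.

```python
import xml.sax
from xml.sax.saxutils import escape

class SelectByYear(xml.sax.ContentHandler):
    """Output title and authors of each book whose <year> matches `wanted`."""

    def __init__(self, wanted="2003"):
        super().__init__()
        self.wanted = wanted
        self.path = []          # stack of open tags
        self.year_text = []     # PCDATA chunks of the current <year>
        self.c = False          # plays the role of the boolean attribute c
        self.out = []

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "bib":
            self.out.append("<books>")
        elif name == "year":
            self.year_text = []
        elif name in ("title", "author") and self.c:
            self.out.append(f"<{name}>")              # ECHO

    def characters(self, content):
        if self.path[-1] == "year":
            self.year_text.append(content)
        elif self.path[-1] in ("title", "author") and self.c:
            self.out.append(escape(content))          # ECHO

    def endElement(self, name):
        self.path.pop()
        if name == "bib":
            self.out.append("</books>")
        elif name == "year":
            # assumed meaning of MATCH_CHILDREN(2003, c)
            self.c = "".join(self.year_text).strip() == self.wanted
            if self.c:
                self.out.append("<book>")
        elif name == "book":
            if self.c:
                self.out.append("</book>")
            self.c = False
        elif self.c:
            self.out.append(f"</{name}>")             # ECHO

def select_by_year(document: str, wanted: str = "2003") -> str:
    handler = SelectByYear(wanted)
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.out)
```

If year were the last child of book instead, the handler would have to buffer title and authors until the year is known, which is exactly what the grammar-based formalism rules out.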
24
Possible Queries Depend on the Underlying Grammar
Cannot select books on year: year is the last child of book
    bib ::= bib( book* )
    book ::= book( title.author.author*.year )
    year ::= year( PCDATA )
    title ::= title( PCDATA )
    author ::= author( PCDATA )
25
bXSAG vs. yXSAG
  • Grammar contained (DTD) ?TDLL(1)? bXSAG
    STDLL(1)?yXSAG
  • User-friendly queries
  • Expressiveness: XML-DPDTs, basic XSAGs, and easy XSAGs share the same expressive power
  • Efficiency
  • Space: stack of size O(depth(Stream))
  • Time: O(f(XSAG) · |Stream|), where f(XSAG) is O(2^|attributes|), or O(|XSAG|^2 + |Stream| · |XSAG|) for bXSAG and O(|XSAG|^3 + |Stream| · |XSAG|) for yXSAG

26
Conclusion & Future Work
  • XSAGs meet the requirements for scalable XML query processing
  • XSAGs have a well-justified foundation
  • XSAGs are user-friendly
  • underlying grammar guides the user
  • macros for typical tasks
  • common queries can be quickly and easily stated
  • Current status: prototype implementation
  • Future Work
  • Java code in attribution functions
  • Process XML Query with XSAGs

27
The END.
28
Related Work
  • Queries on XML Streams
  • L. Fegaras, D. Levine, S. Bose, and V. Chaluvadi. Query Processing of Streamed XML Data. CIKM, 2002.
  • T. J. Green, G. Miklau, M. Onizuka, and D. Suciu. Processing XML Streams with Deterministic Automata. ICDT, 2003.
  • B. Ludäscher, P. Mukhopadhyay, and Y. Papakonstantinou. A Transducer-Based XML Query Processor. VLDB, 2002.
  • D. Olteanu, T. Kiesling, and F. Bry. An Evaluation of Regular Path Expressions with Qualifiers against XML Streams. ICDE, 2003. Poster Session.
  • One-unambiguous Regular Languages
  • A. Brüggemann-Klein and D. Wood. One-Unambiguous Regular Languages. Information and Computation, 1998.
  • XML and Attribute Grammars
  • M. Benedikt, C.Y. Chan, W. Fan, J. Freire, and R. Rastogi. Capturing both Types and Constraints in Data Integration. SIGMOD, 2003.
  • M. Benedikt, C.Y. Chan, W. Fan, R. Rastogi, S. Zheng, and A. Zhou. DTD-Directed Publishing with Attribute Translation Grammars. VLDB, 2002.
  • F. Neven and J. Van den Bussche. Expressiveness of Structured Document Query Languages Based on Attribute Grammars. JACM, Jan. 2002.
  • TDLL(1) Grammars
  • D. Lee, M. Mani, and M. Murata. Reasoning about XML Schema Languages using Formal Language Theory. Technical Report RJ 10197 Log 95071, IBM Research, Nov. 2000.