XML Technologies and Applications

1
XML Technologies and Applications
Rajshekhar Sunderraman
Department of Computer Science
Georgia State University
Atlanta, GA 30302
raj@cs.gsu.edu
V (c). XML Querying: XQuery
December 2005
2
Outline
  • Introduction
  • XML Basics
  • XML Structural Constraint Specification
      • Document Type Definitions (DTDs)
      • XML Schema
  • XML/Database Mappings
  • XML Parsing APIs
      • Simple API for XML (SAX)
      • Document Object Model (DOM)
  • XML Querying and Transformation
      • XPath
      • XSLT
      • XQuery
  • XML Applications

3
XQuery: XML Query Language
  • Integrates XPath with earlier proposed query
    languages: XQL, XML-QL
  • SQL-style, not functional-style
  • Much easier to use as a query language than XSLT
  • Can do pretty much the same things as XSLT and
    more, but typically easier
  • 2004: XQuery 1.0

4
transcript.xml
    <Transcripts>
      <Transcript>
        <Student StudId="111111111" Name="John Doe"/>
        <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
        <CrsTaken CrsCode="MAT123" Semester="F1997" Grade="B"/>
        <CrsTaken CrsCode="EE101" Semester="F1997" Grade="A"/>
        <CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>
      </Transcript>
      <Transcript>
        <Student StudId="987654321" Name="Bart Simpson"/>
        <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>
        <CrsTaken CrsCode="CS308" Semester="F1994" Grade="B"/>
      </Transcript>
      (contd)

5
transcript.xml (contd)
      <Transcript>
        <Student StudId="123454321" Name="Joe Blow"/>
        <CrsTaken CrsCode="CS315" Semester="S1997" Grade="A"/>
        <CrsTaken CrsCode="CS305" Semester="S1996" Grade="A"/>
        <CrsTaken CrsCode="MAT123" Semester="S1996" Grade="C"/>
      </Transcript>
      <Transcript>
        <Student StudId="023456789" Name="Homer Simpson"/>
        <CrsTaken CrsCode="EE101" Semester="F1995" Grade="B"/>
        <CrsTaken CrsCode="CS305" Semester="S1996" Grade="A"/>
      </Transcript>
    </Transcripts>

6
XQuery Basics
  • General structure (FLWR expressions):
        FOR variable declarations
        LET variable := expression,
            variable := expression,
        WHERE condition
        RETURN document
  • Example:
        (: students who took MAT123 :)
        FOR $t IN doc("http://xyz.edu/transcript.xml")//Transcript
        WHERE $t/CrsTaken/@CrsCode = "MAT123"
        RETURN $t/Student
  • Result:
        <Student StudId="111111111" Name="John Doe"/>
        <Student StudId="123454321" Name="Joe Blow"/>
  • The text between (: and :) is a comment; the
    operand of IN is an XQuery expression

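
For readers without an XQuery processor at hand, the FOR / WHERE / RETURN example above can be imitated with an ordinary loop. The following is only a sketch in Python using the standard xml.etree.ElementTree module, not XQuery itself; the function name students_who_took and the abbreviated three-record copy of transcript.xml are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of transcript.xml (three records, fewer courses each).
TRANSCRIPT_XML = """
<Transcripts>
  <Transcript>
    <Student StudId="111111111" Name="John Doe"/>
    <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
    <CrsTaken CrsCode="MAT123" Semester="F1997" Grade="B"/>
  </Transcript>
  <Transcript>
    <Student StudId="987654321" Name="Bart Simpson"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>
  </Transcript>
  <Transcript>
    <Student StudId="123454321" Name="Joe Blow"/>
    <CrsTaken CrsCode="MAT123" Semester="S1996" Grade="C"/>
  </Transcript>
</Transcripts>
"""


def students_who_took(root, course):
    """Hypothetical helper: imitate FOR / WHERE / RETURN with a plain loop."""
    result = []
    for t in root.iter("Transcript"):              # FOR $t IN ...//Transcript
        codes = [ct.get("CrsCode") for ct in t.findall("CrsTaken")]
        if course in codes:                        # WHERE $t/CrsTaken/@CrsCode = course
            result.extend(t.findall("Student"))    # RETURN $t/Student
    return result


root = ET.fromstring(TRANSCRIPT_XML)
names = [s.get("Name") for s in students_who_took(root, "MAT123")]
print(names)
```

The WHERE test is existential: a transcript qualifies as soon as any one of its CrsTaken children carries the requested course code.
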
7
XQuery Basics (contd)
  • The previous query doesn't produce a well-formed
    XML document; the following does (a query inside
    XML):
        <StudentList>
        {
          FOR $t IN doc("transcript.xml")//Transcript
          WHERE $t/CrsTaken/@CrsCode = "MAT123"
          RETURN $t/Student
        }
        </StudentList>
  • FOR binds $t to Transcript elements one by one,
    filters using WHERE, then places Student-children
    as e-children (element children) of StudentList
    using RETURN

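
The point about well-formedness can be checked mechanically: a bare sequence of Student elements has no single root, while the wrapped version parses as one document. A minimal Python sketch follows, assuming an abbreviated two-record transcript; student_list is a hypothetical helper name, not part of any XQuery API.

```python
import xml.etree.ElementTree as ET

TRANSCRIPT_XML = """
<Transcripts>
  <Transcript>
    <Student StudId="111111111" Name="John Doe"/>
    <CrsTaken CrsCode="MAT123" Semester="F1997" Grade="B"/>
  </Transcript>
  <Transcript>
    <Student StudId="123454321" Name="Joe Blow"/>
    <CrsTaken CrsCode="MAT123" Semester="S1996" Grade="C"/>
  </Transcript>
</Transcripts>
"""


def student_list(root, course):
    """Wrap the matching Student elements in a single StudentList root."""
    wrapper = ET.Element("StudentList")
    for t in root.iter("Transcript"):
        if any(ct.get("CrsCode") == course for ct in t.findall("CrsTaken")):
            wrapper.extend(t.findall("Student"))   # Student becomes an element child
    return wrapper


doc = student_list(ET.fromstring(TRANSCRIPT_XML), "MAT123")
serialized = ET.tostring(doc, encoding="unicode")
reparsed = ET.fromstring(serialized)   # succeeds only for well-formed XML
print(reparsed.tag, len(reparsed))
```
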
8
FOR vs LET
FOR: iteration; the variable is bound to each item in turn.
    FOR $x IN doc("transcript.xml")//Transcript
    RETURN <result> { $x } </result>
Returns:
    <result> <Transcript>...</Transcript> </result>
    <result> <Transcript>...</Transcript> </result>
    <result> <Transcript>...</Transcript> </result>
    ...
LET: the whole set value is assigned to the variable at once.
    LET $x := doc("transcript.xml")//Transcript
    RETURN <result> { $x } </result>
Returns:
    <result> <Transcript>...</Transcript>
             <Transcript>...</Transcript>
             <Transcript>...</Transcript>
             ...
    </result>

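
The difference between the two clauses can be mimicked in Python as a rough analogy: FOR evaluates RETURN once per item, LET evaluates it once for the whole sequence. The function names for_style and let_style are illustrative assumptions, and the three empty Transcript elements stand in for real records.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Transcripts><Transcript/><Transcript/><Transcript/></Transcripts>"
)
items = root.findall("Transcript")


def for_style(items):
    """FOR $x IN items RETURN <result>{$x}</result>: one result per item."""
    results = []
    for x in items:
        result = ET.Element("result")
        result.append(x)
        results.append(result)
    return results


def let_style(items):
    """LET $x := items RETURN <result>{$x}</result>: one result holding all items."""
    x = items
    result = ET.Element("result")
    result.extend(x)
    return [result]


print(len(for_style(items)), len(let_style(items)))
```
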
9
Document Restructuring with XQuery
  • Reconstruct lists of students taking each class
    using the Transcript records:
        FOR $c IN distinct-values(doc("transcript.xml")//CrsTaken)
        RETURN
          <ClassRoster CrsCode="{$c/@CrsCode}" Semester="{$c/@Semester}">
          {
            FOR $t IN doc("transcript.xml")//Transcript
            WHERE $t/CrsTaken[@CrsCode = $c/@CrsCode and
                              @Semester = $c/@Semester]
            RETURN $t/Student
              ORDER BY $t/Student/@StudId
          }
          </ClassRoster>
          ORDER BY $c/@CrsCode
  • A query inside RETURN is similar to a query inside
    SELECT in OQL

10
Document Restructuring (contd)
  • Output elements have the form:
        <ClassRoster CrsCode="CS305" Semester="F1995">
          <Student StudId="111111111" Name="John Doe"/>
          <Student StudId="987654321" Name="Bart Simpson"/>
        </ClassRoster>
  • Problem: the above element will be output twice,
    once for each of the following two bindings of $c:
        <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>   (Bart Simpson's)
        <CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>   (John Doe's)
  • Note: the grades are different, so distinct-values()
    won't eliminate transcript records that refer to
    the same class!

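
The duplication can be reproduced with a small Python sketch. It assumes, as the slide does, that duplicate elimination compares each CrsTaken element as a whole (all attributes, including Grade); the helper distinct and its attrs parameter are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET

# The two CrsTaken records for CS305 in F1995 (Bart Simpson's and John Doe's).
records = [
    ET.fromstring('<CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>'),
    ET.fromstring('<CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>'),
]


def distinct(records, attrs):
    """Keep the first occurrence of each combination of the given attributes."""
    seen = []
    for r in records:
        key = tuple(r.get(a) for a in attrs)
        if key not in seen:
            seen.append(key)
    return seen


# Comparing whole elements: the grades differ, so both records survive.
whole = distinct(records, ["CrsCode", "Semester", "Grade"])
# Comparing only what identifies a class: a single binding remains.
by_class = distinct(records, ["CrsCode", "Semester"])
print(len(whole), len(by_class))
```

With two surviving bindings the same ClassRoster is constructed twice, which is the problem described above.
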
11
Document Restructuring (contd)
  • Solution: instead of
        FOR $c IN distinct-values(doc("transcript.xml")//CrsTaken)
    use
        FOR $c IN doc("classes.xml")//Class
    where classes.xml lists course offerings
    (course code/semester) explicitly, so there is no
    need to extract them from transcript records
    (shown on the next slide)
  • Then $c is bound to each class exactly once, so
    each class roster will be output exactly once

12
http://xyz.edu/classes.xml
    <Classes>
      <Class CrsCode="CS308" Semester="F1997">
        <CrsName>SE</CrsName> <Instructor>Adrian Jones</Instructor>
      </Class>
      <Class CrsCode="EE101" Semester="F1995">
        <CrsName>Circuits</CrsName> <Instructor>David Jones</Instructor>
      </Class>
      <Class CrsCode="CS305" Semester="F1995">
        <CrsName>Databases</CrsName> <Instructor>Mary Doe</Instructor>
      </Class>
      <Class CrsCode="CS315" Semester="S1997">
        <CrsName>TP</CrsName> <Instructor>John Smyth</Instructor>
      </Class>
      <Class CrsCode="MAT123" Semester="F1997">
        <CrsName>Algebra</CrsName> <Instructor>Ann White</Instructor>
      </Class>
    </Classes>

13
Document Restructuring (contd)
  • More problems: the above query will list classes
    with no students. A reformulation that avoids this:
        FOR $c IN doc("classes.xml")//Class
        WHERE   (: test that classes aren't empty :)
          doc("transcript.xml")//CrsTaken[@CrsCode = $c/@CrsCode
                                          and @Semester = $c/@Semester]
        RETURN
          <ClassRoster CrsCode="{$c/@CrsCode}" Semester="{$c/@Semester}">
          {
            FOR $t IN doc("transcript.xml")//Transcript
            WHERE $t/CrsTaken[@CrsCode = $c/@CrsCode and
                              @Semester = $c/@Semester]
            RETURN $t/Student
              ORDER BY $t/Student/@StudId
          }
          </ClassRoster>
          ORDER BY $c/@CrsCode

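
The restructuring query, including the test for empty classes, might be imitated in Python as follows. This is a sketch under assumptions: the data is an abbreviated subset of classes.xml and transcript.xml chosen so that CS315 has no students, and took and class_rosters are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

CLASSES = ET.fromstring("""
<Classes>
  <Class CrsCode="CS308" Semester="F1997"/>
  <Class CrsCode="CS305" Semester="F1995"/>
  <Class CrsCode="CS315" Semester="S1997"/>
</Classes>""")

TRANSCRIPTS = ET.fromstring("""
<Transcripts>
  <Transcript>
    <Student StudId="987654321" Name="Bart Simpson"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>
  </Transcript>
  <Transcript>
    <Student StudId="111111111" Name="John Doe"/>
    <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>
  </Transcript>
</Transcripts>""")


def took(t, c):
    """$t/CrsTaken[@CrsCode = $c/@CrsCode and @Semester = $c/@Semester]"""
    return any(ct.get("CrsCode") == c.get("CrsCode")
               and ct.get("Semester") == c.get("Semester")
               for ct in t.findall("CrsTaken"))


def class_rosters(classes, transcripts):
    rosters = []
    # Outer FOR over classes, ordered by course code.
    for c in sorted(classes.iter("Class"), key=lambda c: c.get("CrsCode")):
        # Inner FOR over transcripts, ordered by student id.
        ids = sorted(t.find("Student").get("StudId")
                     for t in transcripts.iter("Transcript") if took(t, c))
        if ids:   # outer WHERE: skip classes with no students
            rosters.append((c.get("CrsCode"), c.get("Semester"), ids))
    return rosters


rosters = class_rosters(CLASSES, TRANSCRIPTS)
print(rosters)
```

Because the outer loop runs over classes.xml, each class is visited exactly once, so no roster is duplicated.
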
14
XQuery Semantics
  • So far the discussion was informal
  • XQuery semantics defines what the expected result
    of a query is
  • Defined analogously to the semantics of SQL

15
XQuery Semantics (contd)
  • Step 1: Produce a list of bindings for variables
      • The FOR clause binds each variable to a list of
        nodes specified by an XQuery expression.
        The expression can be:
          • An XPath expression
          • An XQuery query
          • A function that returns a list of nodes
      • End result of a FOR clause:
          • Ordered list of tuples of document nodes
          • Each tuple is a binding for the variables in
            the FOR clause

16
XQuery Semantics (contd)
  • Example (bindings):
      • Let FOR declare $A and $B
      • Bind $A to document nodes {v, w}; $B to {x, y, z}
      • Then the FOR clause produces the following list
        of bindings for $A and $B:
          [$A/v, $B/x]
          [$A/v, $B/y]
          [$A/v, $B/z]
          [$A/w, $B/x]
          [$A/w, $B/y]
          [$A/w, $B/z]
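
The list of bindings above is the ordered Cartesian product of the two node lists. A short Python sketch using itertools.product, with plain strings standing in for the document nodes v, w, x, y, z:

```python
from itertools import product

A = ["v", "w"]        # nodes bound to $A
B = ["x", "y", "z"]   # nodes bound to $B

# One tuple per combination; the first variable varies slowest,
# which reproduces the order of the list above.
bindings = list(product(A, B))

for a, b in bindings:
    print(f"[$A/{a}, $B/{b}]")
```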

17
XQuery Semantics (contd)
  • Step 2: Filter the bindings via the WHERE clause
      • Use each tuple binding to substitute its
        components for variables; retain those bindings
        that make WHERE true
      • Example: WHERE $A/CrsTaken/@CrsCode = $B/Class/@CrsCode
          • Binding: [$A/w, $B/x], where
              w = <CrsTaken CrsCode="CS308"/>
              x = <Class CrsCode="CS308"/>
          • Then w/CrsTaken/@CrsCode = x/Class/@CrsCode, so
            the WHERE condition is satisfied; the binding
            is retained

18
XQuery Semantics (contd)
  • Step 3: Construct result
      • For each retained tuple of bindings, instantiate
        the RETURN clause
      • This creates a fragment of the output document
      • Do this for each retained tuple of bindings in
        sequence
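
Taken together, the three steps form a small pipeline: bind, filter, construct. The Python sketch below walks through them for two CrsTaken and two Class nodes. It illustrates the described semantics and is not an XQuery engine; the function evaluate and the output element name Match are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from itertools import product

taken = [ET.fromstring(s) for s in (
    '<CrsTaken CrsCode="CS308" Semester="F1997"/>',
    '<CrsTaken CrsCode="MAT123" Semester="F1997"/>',
)]
classes = [ET.fromstring(s) for s in (
    '<Class CrsCode="CS308" Semester="F1997"/>',
    '<Class CrsCode="EE101" Semester="F1995"/>',
)]


def evaluate(taken, classes):
    # Step 1: the FOR clause yields an ordered list of binding tuples.
    bindings = list(product(taken, classes))
    # Step 2: WHERE keeps the tuples that make the condition true.
    retained = [(a, b) for a, b in bindings
                if a.get("CrsCode") == b.get("CrsCode")]
    # Step 3: RETURN is instantiated once per retained tuple, in order.
    fragments = [ET.Element("Match", CrsCode=a.get("CrsCode"))
                 for a, b in retained]
    return bindings, retained, fragments


bindings, retained, fragments = evaluate(taken, classes)
print(len(bindings), len(retained), [f.get("CrsCode") for f in fragments])
```
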

19
Grouping and Aggregation
  • XQuery does not use a separate grouping operator
  • OQL does not need one either (the XML data model
    is object-oriented, hence the similarities with OQL)
  • Subqueries inside the RETURN clause obviate this
    need (as subqueries inside SELECT do in OQL)
  • Uses built-in aggregate functions: count, avg,
    sum, etc. (some borrowed from XPath)

20
Aggregation Example
  • Produce a list of students along with the number
    of courses each student took:
        FOR $t IN fn:doc("transcript.xml")//Transcript,
            $s IN $t/Student
        LET $c := $t/CrsTaken
        RETURN
          <StudentSummary
            StudId="{$s/@StudId}"
            Name="{$s/@Name}"
            TotalCourses="{fn:count(fn:distinct-values($c))}" />
          ORDER BY StudentSummary/@TotalCourses
  • The grouping effect is achieved because $c is
    bound to a new set of nodes for each binding of $t
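
The grouping effect can be seen in a loop-based imitation: the list bound to c is recomputed for every transcript, so the count is per student. This Python sketch uses the first two records of transcript.xml; student_summaries is a hypothetical helper, and counting distinct course codes is an assumption about what "number of courses" should mean here.

```python
import xml.etree.ElementTree as ET

TRANSCRIPT_XML = """
<Transcripts>
  <Transcript>
    <Student StudId="111111111" Name="John Doe"/>
    <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
    <CrsTaken CrsCode="MAT123" Semester="F1997" Grade="B"/>
    <CrsTaken CrsCode="EE101" Semester="F1997" Grade="A"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>
  </Transcript>
  <Transcript>
    <Student StudId="987654321" Name="Bart Simpson"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>
    <CrsTaken CrsCode="CS308" Semester="F1994" Grade="B"/>
  </Transcript>
</Transcripts>
"""


def student_summaries(root):
    summaries = []
    for t in root.iter("Transcript"):          # FOR $t IN ...//Transcript,
        for s in t.findall("Student"):         #     $s IN $t/Student
            c = t.findall("CrsTaken")          # LET $c := $t/CrsTaken (new per $t)
            # Assumption: count distinct course codes among the student's records.
            total = len({ct.get("CrsCode") for ct in c})
            summaries.append(ET.Element(
                "StudentSummary",
                StudId=s.get("StudId"),
                Name=s.get("Name"),
                TotalCourses=str(total)))
    # ORDER BY StudentSummary/@TotalCourses (ascending, compared as numbers)
    return sorted(summaries, key=lambda e: int(e.get("TotalCourses")))


summaries = student_summaries(ET.fromstring(TRANSCRIPT_XML))
print([(e.get("Name"), e.get("TotalCourses")) for e in summaries])
```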

21
Quantification in XQuery
  • XQuery supports explicit quantification:
    SOME (∃) and EVERY (∀)
  • Example: Find students who have taken MAT123.
        FOR $t IN fn:doc("transcript.xml")//Transcript
        WHERE SOME $ct IN $t/CrsTaken
              SATISFIES $ct/@CrsCode = "MAT123"
        RETURN $t/Student
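
The two quantifiers correspond closely to Python's built-in any() and all(). A small sketch on a single Transcript record; the helper names some_course_is and every_grade_is are hypothetical, and the EVERY condition on grades is an invented example, not a query from the slides.

```python
import xml.etree.ElementTree as ET

transcript = ET.fromstring("""
<Transcript>
  <Student StudId="111111111" Name="John Doe"/>
  <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
  <CrsTaken CrsCode="MAT123" Semester="F1997" Grade="B"/>
</Transcript>""")


def some_course_is(t, code):
    # SOME $ct IN $t/CrsTaken SATISFIES $ct/@CrsCode = code
    return any(ct.get("CrsCode") == code for ct in t.findall("CrsTaken"))


def every_grade_is(t, grade):
    # EVERY $ct IN $t/CrsTaken SATISFIES $ct/@Grade = grade
    return all(ct.get("Grade") == grade for ct in t.findall("CrsTaken"))


print(some_course_is(transcript, "MAT123"), every_grade_is(transcript, "B"))
```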

22
Quantification (contd)
  • Retrieve all classes (from classes.xml) that every
    student took.
        FOR $c IN fn:doc("classes.xml")//Class
        LET $g := (
              (: Transcript records that correspond to class $c :)
              FOR $t IN fn:doc("transcript.xml")//Transcript
              WHERE $t/CrsTaken/@Semester = $c/@Semester AND
                    $t/CrsTaken/@CrsCode = $c/@CrsCode
              RETURN $t
            ),
            $h := (
              FOR $s IN fn:doc("transcript.xml")//Transcript
              RETURN $s   (: all transcript records :)
            )
        WHERE EVERY $tr IN $h SATISFIES
              $tr IN $g
        RETURN $c ORDER BY $c/@CrsCode
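
The universal query can be imitated with all(). The Python sketch below mirrors the WHERE condition as written on the slide, where the semester test and the course-code test are each satisfied by any CrsTaken child of the transcript, not necessarily the same one. The data is an abbreviated two-student, two-class subset, and matches and classes_taken_by_everyone are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

CLASSES = ET.fromstring("""
<Classes>
  <Class CrsCode="CS308" Semester="F1997"/>
  <Class CrsCode="CS305" Semester="F1995"/>
</Classes>""")

TRANSCRIPTS = ET.fromstring("""
<Transcripts>
  <Transcript>
    <Student StudId="111111111" Name="John Doe"/>
    <CrsTaken CrsCode="CS308" Semester="F1997" Grade="B"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="A"/>
  </Transcript>
  <Transcript>
    <Student StudId="987654321" Name="Bart Simpson"/>
    <CrsTaken CrsCode="CS305" Semester="F1995" Grade="C"/>
    <CrsTaken CrsCode="CS308" Semester="F1994" Grade="B"/>
  </Transcript>
</Transcripts>""")


def matches(t, c):
    """The two comparisons from the WHERE clause that defines $g."""
    taken = t.findall("CrsTaken")
    return (any(ct.get("Semester") == c.get("Semester") for ct in taken)
            and any(ct.get("CrsCode") == c.get("CrsCode") for ct in taken))


def classes_taken_by_everyone(classes, transcripts):
    h = list(transcripts.iter("Transcript"))   # $h: all transcript records
    result = []
    for c in classes.iter("Class"):
        g = [t for t in h if matches(t, c)]    # $g: records that correspond to $c
        if all(tr in g for tr in h):           # EVERY $tr IN $h SATISFIES $tr IN $g
            result.append(c.get("CrsCode"))
    return sorted(result)                      # ORDER BY $c/@CrsCode


print(classes_taken_by_everyone(CLASSES, TRANSCRIPTS))
```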

23
XQuery Summary
  • FOR-LET-WHERE-RETURN = FLWR

    FOR/LET Clauses
        -> List of tuples
    WHERE Clause
        -> List of tuples
    RETURN Clause
        -> Instance of XQuery data model