XML Language Family Detailed Examples - PowerPoint PPT Presentation

Provided by: RenateMo5
1
XML Language Family: Detailed Examples
  • Most information contained in these slides comes
    from http://www.w3.org/ and http://www.zvon.org/
  • These slides are intended to be used as a
    tutorial on XML and related technologies
  • Slide author: Jürgen Mangler
    (juergen.mangler@univie.ac.at)
  • This section contains examples on
  • XPath,
  • XPointer

2
XPath is the result of an effort to provide a
common syntax and semantics for functionality
shared between XSL Transformations (XSLT) and
XPointer. The primary purpose of XPath is to
address parts of an XML document.
  • XPath uses a compact, non-XML syntax to
    facilitate use of XPath within URIs and XML
    attribute values.
  • XPath operates on the abstract, logical structure
    of an XML document, rather than its surface
    syntax.
  • XPath gets its name from its use of a path
    notation as in URLs for navigating through the
    hierarchical structure of an XML document.

3
  • In addition to its use for addressing, XPath is
    also designed to feature a natural subset that
    can be used for matching (testing whether or not
    a node matches a pattern); this use of XPath is
    described in XSLT (next chapter).
  • XPath models an XML document as a tree of nodes.
    There are different types of nodes, including
    element nodes, attribute nodes and text nodes.

4
  • The basic XPath syntax is similar to filesystem
    addressing. If the path starts with a slash (/),
    it represents an absolute path to the required
    element.

/AAA/CCC
Select all elements CCC which are children of the root element AAA.

     <AAA>
          <BBB/>
          <CCC/>
          <BBB/>
          <BBB/>
          <DDD>
               <BBB/>
          </DDD>
          <CCC/>
     </AAA>

/AAA
Select the root element AAA.

     <AAA>
          <BBB/>
          <CCC/>
          <BBB/>
          <BBB/>
          <DDD>
               <BBB/>
          </DDD>
          <CCC/>
     </AAA>
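As a rough illustration of the two absolute paths above, the following sketch uses Python's standard xml.etree.ElementTree module. The choice of Python is an assumption of this example, not part of the slides. ElementTree supports only a subset of XPath and evaluates paths relative to an element, so /AAA/CCC is written here as ./CCC starting from the root element AAA.

```python
import xml.etree.ElementTree as ET

# Sample document from the slide.
doc = """
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
"""

root = ET.fromstring(doc)             # /AAA: the root element
ccc_children = root.findall("./CCC")  # /AAA/CCC, in ElementTree's relative form

print(root.tag)           # AAA
print(len(ccc_children))  # 2
```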
5
  • If the path starts with //, then all elements in
    the document that fulfill the criteria following
    // are selected.

//DDD/BBB
Select all elements BBB which are children of DDD.

     <AAA>
          <BBB/>
          <DDD>
               <BBB/>
          </DDD>
          <CCC>
               <DDD>
                    <BBB/>
                    <BBB/>
               </DDD>
          </CCC>
     </AAA>

//BBB
Select all elements BBB.

     <AAA>
          <BBB/>
          <DDD>
               <BBB/>
          </DDD>
          <CCC>
               <DDD>
                    <BBB/>
                    <BBB/>
               </DDD>
          </CCC>
     </AAA>
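The // examples can be tried out with a small sketch, again assuming Python's xml.etree.ElementTree (which is not mentioned in the slides). ElementTree needs a leading dot, so //BBB becomes .//BBB evaluated from the root element.

```python
import xml.etree.ElementTree as ET

doc = """
<AAA>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC>
    <DDD>
      <BBB/>
      <BBB/>
    </DDD>
  </CCC>
</AAA>
"""

root = ET.fromstring(doc)

all_bbb = root.findall(".//BBB")         # //BBB
bbb_in_ddd = root.findall(".//DDD/BBB")  # //DDD/BBB

print(len(all_bbb))     # 4
print(len(bbb_in_ddd))  # 3
```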
6
  • The star (*) selects all elements located by the
    preceding path.

/*/*/*/BBB
Select all elements BBB which have 3 ancestors.

     <AAA>
          <CCC>
               <DDD>
                    <BBB/>
               </DDD>
          </CCC>
          <CCC>
               <DDD>
                    <BBB/>
               </DDD>
          </CCC>
     </AAA>

/AAA/CCC/DDD/*
Select all elements enclosed by elements /AAA/CCC/DDD.

     <AAA>
          <BBB/>
          <DDD>
               <BBB/>
          </DDD>
          <CCC>
               <DDD>
                    <BBB/>
                    <BBB/>
               </DDD>
          </CCC>
     </AAA>
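A minimal sketch of the star wildcard, assuming Python's xml.etree.ElementTree. Because ElementTree paths start at the root element, the first step of /*/*/*/BBB (the root itself) is dropped and the path is written as ./*/*/BBB.

```python
import xml.etree.ElementTree as ET

doc1 = "<AAA><CCC><DDD><BBB/></DDD></CCC><CCC><DDD><BBB/></DDD></CCC></AAA>"
doc2 = "<AAA><BBB/><DDD><BBB/></DDD><CCC><DDD><BBB/><BBB/></DDD></CCC></AAA>"

root1 = ET.fromstring(doc1)
root2 = ET.fromstring(doc2)

# /*/*/*/BBB: BBB elements with exactly three ancestors
deep_bbb = root1.findall("./*/*/BBB")

# /AAA/CCC/DDD/*: every element directly inside /AAA/CCC/DDD
inside_ddd = root2.findall("./CCC/DDD/*")

print(len(deep_bbb))                 # 2
print([e.tag for e in inside_ddd])   # ['BBB', 'BBB']
```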
7
  • The expression in square brackets can further
    specify an element. A number in the brackets
    gives the position of the element in the selected
    set. The function last() selects the last element
    in the selection.

/papers/paper[last()]
Select the last paper child of element papers.

     <papers>
          <paper author="motschnig"/>
          <paper author="derntl"/>
          <paper author="motschnig"/>
          <paper author="mangler"/>
     </papers>

/papers/paper[1]
Select the first paper child of element papers.

     <papers>
          <paper author="motschnig"/>
          <paper author="derntl"/>
          <paper author="motschnig"/>
          <paper author="mangler"/>
     </papers>
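The position predicates can be checked with a short sketch, assuming Python's xml.etree.ElementTree, which supports both a numeric position and last() in square brackets.

```python
import xml.etree.ElementTree as ET

papers = ET.fromstring("""
<papers>
  <paper author="motschnig"/>
  <paper author="derntl"/>
  <paper author="motschnig"/>
  <paper author="mangler"/>
</papers>
""")

first = papers.find("paper[1]")      # /papers/paper[1]
last = papers.find("paper[last()]")  # /papers/paper[last()]

print(first.get("author"))  # motschnig
print(last.get("author"))   # mangler
```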
8
  • Attributes are specified by the @ prefix.

//student[@matnr]
Select student elements which have the attribute matnr.

     <students>
          <student matnr="9506264"/>
          <student matnr="0002843"/>
          <student name="Hauer"/>
          <student/>
     </students>

//@matnr
Select all attributes matnr.

     <students>
          <student matnr="9506264"/>
          <student matnr="0002843"/>
          <student name="Hauer"/>
          <student/>
     </students>

//student[not(@*)]
Select student elements without an attribute.

     <students>
          <student id="9506264"/>
          <student id="0002843"/>
          <student name="Koegler"/>
          <student/>
     </students>

//student[@*]
Select student elements which have any attribute.

     <students>
          <student id="9506264"/>
          <student id="0002843"/>
          <student name="Hauer"/>
          <student/>
     </students>
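The attribute tests can be sketched as follows, assuming Python's xml.etree.ElementTree. ElementTree understands [@matnr] directly; it does not support //@matnr, [@*] or not(), so those three are emulated here with ordinary Python over the attrib dictionary. That emulation is an assumption of this example, not an XPath feature.

```python
import xml.etree.ElementTree as ET

students = ET.fromstring("""
<students>
  <student matnr="9506264"/>
  <student matnr="0002843"/>
  <student name="Hauer"/>
  <student/>
</students>
""")

# //student[@matnr]
with_matnr = students.findall(".//student[@matnr]")

# //@matnr (emulated): the matnr attribute values of all elements
matnr_values = [e.attrib["matnr"] for e in students.iter() if "matnr" in e.attrib]

# //student[@*] (emulated): students with at least one attribute
any_attr = [s for s in students.iter("student") if s.attrib]

# //student[not(@*)] (emulated): students without any attribute
no_attr = [s for s in students.iter("student") if not s.attrib]

print(len(with_matnr))  # 2
print(matnr_values)     # ['9506264', '0002843']
print(len(any_attr))    # 3
print(len(no_attr))     # 1
```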
9
  • Values of attributes can be used as selection
    criteria. Function normalize-space removes
    leading and trailing spaces and replaces
    sequences of whitespace characters by a single
    space.

//student[normalize-space(@name)='hauer']
Select student elements which have an attribute name with
value hauer; leading and trailing spaces are removed before
the comparison.

     <students>
          <student matnr="9506264"/>
          <student name=" hauer "/>
          <student name="hauer"/>
     </students>

//student[@id='b1']
Select student elements which have the attribute id with
value b1 (the sample below contains no id attribute, so
nothing is selected here).

     <students>
          <student matnr="9506264"/>
          <student name=" hauer "/>
          <student name="hauer"/>
     </students>
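Python's xml.etree.ElementTree has no normalize-space(), so the sketch below emulates it with a small helper. The helper name normalize_space is hypothetical, and Python's str.split() treats a few more characters as whitespace than XPath does, so this is only an approximation of the XPath function.

```python
import xml.etree.ElementTree as ET

def normalize_space(value):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(value.split())

students = ET.fromstring("""
<students>
  <student matnr="9506264"/>
  <student name=" hauer "/>
  <student name="hauer"/>
</students>
""")

# //student[normalize-space(@name)='hauer'] (emulated)
matches = [
    s for s in students.iter("student")
    if normalize_space(s.get("name", "")) == "hauer"
]

print(len(matches))                     # 2
print(normalize_space("  a   b \n c"))  # a b c
```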
10
  • Function count() counts the number of selected
    elements

//*[count(*)=3]
Select elements which have 3 children.

     <AAA>
          <CCC>
               <BBB/>
               <BBB/>
               <BBB/>
          </CCC>
          <DDD>
               <BBB/>
          </DDD>
          <EEE>
               <CCC/>
          </EEE>
     </AAA>

//*[count(BBB)=2]
Select elements which have two children BBB.

     <AAA>
          <CCC>
               <BBB/>
          </CCC>
          <DDD>
               <BBB/>
               <BBB/>
          </DDD>
          <EEE>
               <CCC/>
               <DDD/>
          </EEE>
     </AAA>
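count() is not part of the XPath subset in Python's xml.etree.ElementTree, so the following sketch (an emulation, assuming that module) counts child elements in plain Python: len(element) gives the number of child elements, and findall("BBB") returns the BBB children.

```python
import xml.etree.ElementTree as ET

doc_a = "<AAA><CCC><BBB/><BBB/><BBB/></CCC><DDD><BBB/></DDD><EEE><CCC/></EEE></AAA>"
doc_b = "<AAA><CCC><BBB/></CCC><DDD><BBB/><BBB/></DDD><EEE><CCC/><DDD/></EEE></AAA>"

root_a = ET.fromstring(doc_a)
root_b = ET.fromstring(doc_b)

# //*[count(*)=3] (emulated): elements with exactly three child elements
three_children = [e.tag for e in root_a.iter() if len(e) == 3]

# //*[count(BBB)=2] (emulated): elements with exactly two BBB children
two_bbb = [e.tag for e in root_b.iter() if len(e.findall("BBB")) == 2]

print(three_children)  # ['AAA', 'CCC']
print(two_bbb)         # ['DDD']
```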
11
  • Several paths can be combined with the | separator
    ("|" stands for "or", like the logical or
    operator in C). The result is the union of the
    nodes selected by each path.

/AAA/EEE | //DDD/CCC | /AAA | //BBB
The number of combinations is not restricted.

     <AAA>
          <BBB/>
          <CCC/>
          <DDD>
               <CCC/>
          </DDD>
          <EEE/>
     </AAA>

/AAA/EEE | //BBB
Select all elements BBB and elements EEE which are children of
the root element AAA.

     <AAA>
          <BBB/>
          <CCC/>
          <DDD>
               <CCC/>
          </DDD>
          <EEE/>
     </AAA>
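Python's xml.etree.ElementTree does not support the | operator, so the sketch below emulates a union: it evaluates each path separately and returns the matched elements once each, in document order. The helper name union is hypothetical.

```python
import xml.etree.ElementTree as ET

def union(root, *paths):
    """Return the elements matched by any path, without duplicates, in document order."""
    selected = set()
    for path in paths:
        selected.update(id(e) for e in root.findall(path))
    return [e for e in root.iter() if id(e) in selected]

root = ET.fromstring("<AAA><BBB/><CCC/><DDD><CCC/></DDD><EEE/></AAA>")

# /AAA/EEE | //BBB, written relative to the root element AAA
result = union(root, "./EEE", ".//BBB")

print([e.tag for e in result])  # ['BBB', 'EEE']
```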
12
Axes are a sophisticated concept in XPath for finding
out which nodes relate to each other and how.

     <parent>
          <preceding-sibling/>
          <preceding-sibling/>
          <node>
               <descendant/>
               <descendant/>
          </node>
          <following-sibling/>
          <following-sibling/>
     </parent>

The above example illustrates how axes work.
Starting from the element node, each axis selects the
elements that are named after it. This example is also
the basis for the next two pages.
13
  • The following main axes are available:
  • the child axis contains the children of the
    context node
  • the descendant axis contains the descendants of
    the context node; a descendant is a child or a
    child of a child and so on; thus the descendant
    axis never contains attribute or namespace nodes
  • the parent axis contains the parent of the
    context node, if there is one
  • the following-sibling axis contains all the
    following siblings of the context node; if the
    context node is an attribute node or namespace
    node, the following-sibling axis is empty
  • the preceding-sibling axis contains all the
    preceding siblings of the context node; if the
    context node is an attribute node or namespace
    node, the preceding-sibling axis is empty
  • (http://www.w3.org/TR/xpath#axes)
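The axes listed above can be emulated in a few lines of Python. This is only a sketch under stated assumptions: it uses xml.etree.ElementTree, which has no axis syntax and no parent pointers, so a parent map is built by hand, and the sibling lists are returned in document order (XPath treats preceding-sibling as a reverse axis).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<parent>
  <preceding-sibling/>
  <preceding-sibling/>
  <node>
    <descendant/>
    <descendant/>
  </node>
  <following-sibling/>
  <following-sibling/>
</parent>
""")

context = root.find("node")  # the context node

# Map every element to its parent (ElementTree stores no parent links).
parent_of = {child: parent for parent in root.iter() for child in parent}

parent = parent_of[context]                                      # parent axis
siblings = list(parent)
position = siblings.index(context)
preceding = siblings[:position]                                  # preceding-sibling axis
following = siblings[position + 1:]                              # following-sibling axis
children = list(context)                                         # child axis
descendants = [e for e in context.iter() if e is not context]    # descendant axis

print(parent.tag)                       # parent
print([e.tag for e in preceding])       # ['preceding-sibling', 'preceding-sibling']
print([e.tag for e in following])       # ['following-sibling', 'following-sibling']
print([e.tag for e in descendants])     # ['descendant', 'descendant']
```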

14
  • The child axis contains the children of the
    context node. The child axis is the default axis
    and it can be omitted.
  • The descendant axis contains the descendants of
    the context node; a descendant is a child or a
    child of a child and so on; thus the descendant
    axis never contains attribute or namespace nodes.

//CCC/descendant::DDD
Select elements DDD which have CCC among their ancestors.

     <CCC>
          <DDD>
               <EEE>
                    <DDD/>
               </EEE>
          </DDD>
     </CCC>

/AAA
Equivalent of /child::AAA.

     <AAA>
          <BBB/>
          <CCC/>
     </AAA>
15
  • XPointer is intended to be the basis of fragment
    identifiers only for the text/xml and
    application/xml media types (they can point only
    to documents of these types).
  • Pointing to fragments of remote documents is
    analogous to the use of anchors in HTML. Roughly:
    document#xpointer(...)

<link xmlns:xlink="http://www.w3.org/1999/xlink"
      xlink:type="simple"
      xlink:href="mydocument.xml#xpointer(//AAA/BBB[1])">
</link>
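To make the document#xpointer(...) shape concrete, here is a minimal Python sketch that splits such a link target into the document part and the expression inside xpointer(...). The function name split_xpointer is hypothetical, and the sketch deliberately ignores multiple pointer parts and circumflex escapes.

```python
def split_xpointer(href):
    """Split 'document#xpointer(expr)' into (document, expr).

    Returns (document, None) when the fragment is not a single xpointer() part.
    """
    document, _, fragment = href.partition("#")
    prefix, suffix = "xpointer(", ")"
    if fragment.startswith(prefix) and fragment.endswith(suffix):
        return document, fragment[len(prefix):-len(suffix)]
    return document, None

document, expression = split_xpointer("mydocument.xml#xpointer(//AAA/BBB[1])")

print(document)    # mydocument.xml
print(expression)  # //AAA/BBB[1]
```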
16
  • If there are forbidden characters in your
    expression, you must deal with them somehow.
  • When XPointer appears in an XML document, special
    characters must be escaped according to
    directions in XML.
  • The characters < or & must be escaped using &lt;
    and &amp;.
  • Any unbalanced parenthesis must be escaped using
    a circumflex (^).

<link xmlns:xlink="http://www.w3.org/1999/xlink"
      xlink:type="simple"
      xlink:href="test.xml#xpointer(//AAA[position() &lt; 2])">

or, respectively:

      xlink:href="test.xml#xpointer(string-range('^(text in'))">
</link>
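Both escaping rules can be sketched in Python. The XML escaping uses xml.sax.saxutils.escape from the standard library. The circumflex helper escape_literal_parens is hypothetical: as a simplifying assumption it escapes every parenthesis and circumflex in a string literal, which is more than the rule strictly requires (only unbalanced parentheses must be escaped).

```python
from xml.sax.saxutils import escape

def escape_literal_parens(text):
    """Circumflex-escape '^', '(' and ')' inside a string literal."""
    # Escape '^' first so the circumflexes added for parentheses are not doubled.
    return text.replace("^", "^^").replace("(", "^(").replace(")", "^)")

# XML escaping of the pointer before it is placed in an attribute value.
pointer = "xpointer(//AAA[position() < 2])"
href = "test.xml#" + escape(pointer)

# Circumflex escaping of an unbalanced parenthesis in a search string.
literal = escape_literal_parens("(text in")

print(href)     # test.xml#xpointer(//AAA[position() &lt; 2])
print(literal)  # ^(text in
```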
17
  • If your elements have an ID-type attribute, you
    can address them directly using the value of the
    ID-type attribute. (Don't forget you must have
    an attribute defined as an ID type in your DTD!)
  • Using ID-type attributes, you can easily include
    or jump to parts of documents.
  • The example below selects the node with id("b1").

xpointer(id("b1"))

     <book>
          <book id="b1" name="XML">Bad book.</book>
          <book id="b2" name="JAVA"> Good book.
               <additional>Makes me sleep like a baby.</additional>
          </book>
          <book id="123" name="42">All answers on only one page.</book>
     </book>
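A rough Python sketch of the id() lookup follows. Note the assumption: xml.etree.ElementTree does not read DTDs, so this example simply treats every attribute named id as if it were declared as ID type, which a real XPointer processor would not do without a DTD declaration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<book>
  <book id="b1" name="XML">Bad book.</book>
  <book id="b2" name="JAVA">Good book.
    <additional>Makes me sleep like a baby.</additional>
  </book>
  <book id="123" name="42">All answers on only one page.</book>
</book>
""")

# Index every element that carries an "id" attribute (assumed to be ID-typed).
by_id = {e.get("id"): e for e in root.iter() if e.get("id") is not None}

target = by_id["b1"]  # emulates xpointer(id("b1"))

print(target.get("name"))  # XML
print(target.text)         # Bad book.
```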
18
  • The specification defines one full form and one
    shorthand form (which is an abbreviation of the
    full one).
  • Short Form: /1/2/3
  • Full Form: xpointer(/*[1]/*[2]/*[3])

     <AAA>
          <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB>
          <BBB myid="b2" bbb="222"> Text in another element BBB.
               <DDD ddd="999">Text in more nested element.</DDD>
               <DDD ddd="888">Text in more nested element.</DDD>
               <DDD ddd="777">Text in more nested element.</DDD>
          </BBB>
          <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
     </AAA>
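The short form /1/2/3 reads as "first element of the document, then its second child element, then that element's third child element". A minimal Python sketch of this walk, assuming xml.etree.ElementTree and a hypothetical helper named resolve_child_sequence:

```python
import xml.etree.ElementTree as ET

def resolve_child_sequence(root, sequence):
    """Follow a child sequence such as '/1/2/3'; return None if a step does not exist."""
    steps = [int(step) for step in sequence.strip("/").split("/")]
    if steps[0] != 1:
        return None  # a document has exactly one document element
    node = root
    for step in steps[1:]:
        children = list(node)  # child elements only, in document order
        if not 1 <= step <= len(children):
            return None
        node = children[step - 1]
    return node

root = ET.fromstring("""
<AAA>
  <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB>
  <BBB myid="b2" bbb="222">Text in another element BBB.
    <DDD ddd="999">Text in more nested element.</DDD>
    <DDD ddd="888">Text in more nested element.</DDD>
    <DDD ddd="777">Text in more nested element.</DDD>
  </BBB>
  <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
</AAA>
""")

target = resolve_child_sequence(root, "/1/2/3")

print(target.tag, target.get("ddd"))  # DDD 777
```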
19
  • A location of type point is defined by a node,
    called the container node (node that contains the
    point), and a non-negative integer, called the
    index.
  • (//AAA and //AAA/BBB are the container nodes; [1]
    or [2] is used if more than one container node of
    the same name exists)

(▲ marks the selected point.)

xpointer(start-point(//AAA))
xpointer(start-point(range(//AAA/BBB[1])))

     <AAA>▲
          <BBB bbb="111"></BBB>
          <BBB bbb="222">
               <DDD ddd="999"></DDD>
          </BBB>
          <CCC ccc="123" xxx="321"/>
     </AAA>

xpointer(end-point(range(//AAA/BBB[2])))
xpointer(start-point(range(//AAA/CCC)))

     <AAA>
          <BBB bbb="111"></BBB>
          <BBB bbb="222">
               <DDD ddd="999"></DDD>
          </BBB>▲
          <CCC ccc="123" xxx="321"/>
     </AAA>
20
  • When the container node of a point is of a node
    type that cannot have child nodes (such as text
    nodes, comments, and processing instructions),
    then the index is an index into the characters of
    the string-value of the node; such a point is
    called a character-point.
  • You can use this to write a link that behaves
    like a search function. It always jumps to the
    first appearance of a string, e.g. the word
    "another".

xpointer(start-point(string-range(//*,'another',2,0)))

     <AAA>
          <BBB bbb="111">Text in the first element BBB.</BBB>
          <BBB bbb="222"> Text in a▲nother element BBB.
               <DDD ddd="999">Text in more nested element.</DDD>
          </BBB>
          <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
     </AAA>
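The character-point in this example can be approximated with plain string arithmetic. The sketch below is an assumption-laden simplification: it works on a single text string rather than on a location-set, and the helper name string_range_start is hypothetical. The third argument of string-range is the 1-based position of the first character of the range relative to the match, so position 2 lands between "a" and "nother".

```python
def string_range_start(text, needle, position=1):
    """Character index of the start point for the first match of needle, or None."""
    found = text.find(needle)
    if found < 0:
        return None
    return found + position - 1

text = "Text in another element BBB."

point = string_range_start(text, "another", 2)
marked = text[:point] + "▲" + text[point:]

print(point)   # 9
print(marked)  # Text in a▲nother element BBB.
```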
21
  • The range function returns ranges covering the
    locations in the argument location-set. For each
    location x in the argument location-set, a range
    location representing the covering range of x is
    added to the result location set.

The range-inside function returns ranges covering
the contents of the locations in the argument
location-set.
xpointer(range(//AAA/BBB[2]))

     <AAA>
          <BBB bbb="111"/>
          <BBB bbb="222">   Text in another element BBB.   </BBB>
          <CCC ccc="123" xxx="321"/>
     </AAA>

xpointer(range-inside(//AAA/BBB[2]))

     <AAA>
          <BBB bbb="111"/>
          <BBB bbb="222">   Text in another element BBB.   </BBB>
          <CCC ccc="123" xxx="321"/>
     </AAA>
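The difference between the two functions can be approximated textually: range covers the whole element including its tags, while range-inside covers only what lies between the tags. The following Python sketch, assuming xml.etree.ElementTree, shows this by serializing the second BBB element; it is an illustration of the idea, not an XPointer implementation.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<AAA><BBB bbb="111"/>'
    '<BBB bbb="222">Text in another element BBB.</BBB>'
    '<CCC ccc="123" xxx="321"/></AAA>'
)

bbb2 = root.find("BBB[2]")  # //AAA/BBB[2]
bbb2.tail = None            # text after the end tag is not part of the element

# range(...): the covering range of the element, tags included
covering = ET.tostring(bbb2, encoding="unicode")

# range-inside(...): only the contents of the element
inside = (bbb2.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in bbb2
)

print(covering)  # <BBB bbb="222">Text in another element BBB.</BBB>
print(inside)    # Text in another element BBB.
```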
22
  • For each location x in the argument location-set,
    end-point adds a location of type point to the
    result location-set. That point represents the
    end point of location x.

xpointer(end-point(string-range(//AAA/BBB,'another')))

     <AAA>
          <BBB bbb="111">Text in the first element BBB.</BBB>
          <BBB bbb="222"> Text in another▲ element BBB.
               <DDD ddd="999">Text in more nested element.</DDD>
          </BBB>
          <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
     </AAA>