Web Data Management - PowerPoint PPT Presentation

About This Presentation
Title:

Web Data Management

Description:

Title: XPath Last modified by: kpassi Created Date: 6/26/2001 3:41:12 AM Document presentation format: On-screen Show (4:3) Other titles: Times New Roman Arial ... – PowerPoint PPT presentation

Slides: 71
Provided by: csLaurent4
1
Web Data Management
  • XPath

2
In this lecture
  • Review of the XPath specification
  • data model
  • examples
  • syntax
  • Resources
  • A formal semantics of patterns in XSLT by Phil
    Wadler.
  • XML Path Language (XPath) www.w3.org/TR/xpath

3
XPath
  • http://www.w3.org/TR/xpath (11/99)
  • Building block for other W3C standards
  • XSL Transformations (XSLT)
  • XML Link (XLink)
  • XML Pointer (XPointer)
  • XML Query
  • Was originally part of XSL

4
XPath
  • An expression language to be used in another host
    language (e.g., XSLT, XQuery).
  • Allows the description of paths in an XML tree,
    and the retrieval of nodes that match these
    paths.
  • Can also be used for performing some (limited)
    operations on XML data.

5
Example for XPath Queries
  • <bib>
      <book>
        <publisher> Addison-Wesley </publisher>
        <author> Serge Abiteboul </author>
        <author>
          <first-name> Rick </first-name>
          <last-name> Hull </last-name>
        </author>
        <author> Victor Vianu </author>
        <title> Foundations of Databases </title>
        <year> 1995 </year>
      </book>
      <book price="55">
        <publisher> Freeman </publisher>
        <author> Jeffrey D. Ullman </author>
        <title> Principles of Database and Knowledge Base Systems </title>
        <year> 1998 </year>
      </book>
    </bib>

6
Data Model for XPath
  • XPath expressions operate over XML trees, which
    consist of the following node types
  • Document: the root node of the XML document
  • Element: element nodes
  • Attribute: attribute nodes, attached to an Element
    node (their parent) but not counted among its
    children
  • Text: text nodes, i.e., leaves of the XML tree.

7
Data Model for XPath
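[Figure: XPath tree of the example document. The root has processing-instruction and comment children and the root element; book elements carry attributes and contain publisher and author elements, whose text leaves are "Addison-Wesley" and "Serge Abiteboul". Much like the XQuery data model.]
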
8
Data Model for XPath
  • The root node of an XML tree is the (unique)
    Document node
  • The root element is the (unique) Element child of
    the root node
  • A node has a name, or a value, or both
  • an Element node has a name, but no value
  • a Text node has a value (a character string), but
    no name
  • an Attribute node has both a name and a value.
  • Attributes are special! Attributes are not
    considered as first-class nodes in an XML tree.
    They must be addressed specifically, when needed.

9
XPath Simple Expressions
  • /bib/book/year
  • Result: <year> 1995 </year>
  •         <year> 1998 </year>
  • /bib/paper/year
  • Result: empty (there were no papers)
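
The two queries above can be tried with Python's standard library. This is a minimal sketch, assuming `xml.etree.ElementTree`, which supports only a small subset of XPath: paths are written relative to the root element `bib`, so `./book/year` stands in for `/bib/book/year`. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

# fromstring() returns the root element (bib), not the document node,
# so the paths below are relative to bib.
bib = ET.fromstring(BIB)

years = [y.text for y in bib.findall("./book/year")]   # stands in for /bib/book/year
paper_years = bib.findall("./paper/year")              # stands in for /bib/paper/year

print(years)        # ['1995', '1998']
print(paper_years)  # [] (there are no paper elements)
```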

10
XPath Tree Nodes
  • Seven node types
  • root, element, attribute, text, comment,
    processing instruction and namespace
  • Namespace and attribute nodes have parent nodes,
    but are not children of those parent nodes.
  • The relationship between a parent node and a
    child node is containment
  • Attribute nodes and namespace nodes describe
    their parent nodes

11
Xpath Tree Nodes
  • <?xml version = "1.0"?>
  • <!-- Fig. 11.3 : simple2.xml -->
  • <!-- Processing instructions and namespaces -->
  • <html xmlns = "http://www.w3.org/TR/REC-html40">
  • <head> <title>Processing Instruction and
    Namespace Nodes</title> </head>
  • <?deitel:processor example = "fig11_03.xml"?>
  • <body>
  • <deitel:book deitel:edition = "1"
    xmlns:deitel = "http://www.deitel.com/xmlhtp1">
    <deitel:title>XML How to Program</deitel:title>
    </deitel:book>
  • </body>
  • </html>


12
XPath Tree Nodes
  • String-value: Each XPath tree node has a string
    representation that XPath uses to compare nodes.
  • The string-value of a text node consists of the
    character data contained in the node.
  • Document order: Nodes in an XPath tree have an
    ordering that is determined by the order in which
    the nodes appear in the original XML document.
  • The reverse document order is the reverse
    ordering of the nodes in a document.
  • The string-value for the html element node is
    determined by concatenating the string-values of
    all of its descendant text nodes in document
    order.
  • The string-value for element node html is
  • Processing Instruction and Namespace NodesXML How
     to Program
  • Because whitespace is removed when the text
    nodes are normalized, there is no space between
    "Nodes" and "XML" in the concatenation.

13
XPath Tree Nodes
  • For processing instructions, the string-value
    consists of the remainder of the processing
    instruction after the target, including
    whitespace, but excluding the ending ?>
  • The string-value for the processing instruction
    is
  • example = "fig11_03.xml"
  • Namespace-node string-values consist of the URI
    for the namespace.
  • The string-value for the namespace declaration is
  • http://www.deitel.com/xmlhtp1

14
XPath Tree Nodes
  • For the root node of the document, the
    string-value is also determined by concatenating
    the string-values of its text-node descendants in
    document order.
  • The string-value of the root node is therefore
    identical to the string-value calculated for the
    html element node
  • The string-value for the edition attribute node
    consists of its value, which is 1.
  • The string-value for a comment node consists only
    of the comment's text, excluding <!-- and -->.
  • The string-value for the second comment node is
    therefore Processing instructions and
    namespaces.
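
The string-value rules for element and attribute nodes can be illustrated with a short sketch. It assumes Python's `xml.etree.ElementTree` and a simplified, namespace-free version of the example document (the processing instruction is left out); the helper `string_value` is a name introduced here, not part of any library.

```python
import xml.etree.ElementTree as ET

# Simplified version of the slide's document, written without whitespace
# between elements so that no whitespace-only text nodes are created.
HTML = (
    "<html><head><title>Processing Instruction and Namespace Nodes</title></head>"
    "<!-- Processing instructions and namespaces -->"
    '<body><book edition="1"><title>XML How to Program</title></book></body></html>'
)

def string_value(element):
    """Concatenate all descendant text nodes in document order."""
    return "".join(element.itertext())

html = ET.fromstring(HTML)              # the default parser discards comments
book = html.find("./body/book")

print(string_value(html))               # Processing Instruction and Namespace NodesXML How to Program
print(book.get("edition"))              # 1 (string-value of the attribute node)
```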

15
XPath Tree Nodes
  • Expanded-name: Certain nodes (i.e., element,
    attribute, processing instruction and namespace)
    also have an expanded-name that can be used to
    locate specific nodes in the XPath tree.
  • Expanded-names consist of both a local part and a
    namespace URI.
  • The local part for the element node html is
    therefore html.
  • If there is a prefix for the element node, the
    namespace URI of the expanded-name is the URI to
    which the prefix is bound.
  • If there is no prefix for the element node, the
    namespace URI of the expanded name is the URI for
    the default namespace.

16
XPath Tree Nodes
  • The local part of the expanded name for a
    processing instruction node corresponds to the
    target of the processing instruction in the XML
    document.
  • For processing instructions, the namespace URI of
    the expanded-name is null
  • The local part of the expanded-name for a
    namespace node corresponds to the prefix for the
    namespace, if one exists or, if it is a default
    namespace, the local part is empty (i.e., the
    empty string).
  • The namespace URI of the expanded-name for a
    namespace node is always null.

17
XPath Tree Nodes
Node type | String-value | Expanded-name | Description
root | Determined by concatenating the string-values of all text-node descendants in document order. | None. | Represents the root of an XML document. This node exists only at the top of the tree and may contain element, comment or processing-instruction children.
element | Determined by concatenating the string-values of all text-node descendants in document order. | The element tag, including the namespace prefix (if applicable). | Represents an XML element and may contain element, text, comment or processing-instruction children.
attribute | The normalized value of the attribute. | The name of the attribute, including the namespace prefix (if applicable). | Represents an attribute of an element.
text | The character data contained in the text node. | None. | Represents the character data content of an element.
comment | The content of the comment (not including <!-- and -->). | None. | Represents an XML comment.
processing instruction | The part of the processing instruction that follows the target and any whitespace. | The target of the processing instruction. | Represents an XML processing instruction.
namespace | The URI of the namespace. | The namespace prefix. | Represents an XML namespace.
18
XPath Axes
  • A location path is an expression that specifies
    how to navigate an XPath tree from one node to
    another.
  • A location path is composed of location steps,
    each of which is composed of an "axis," a "node
    test" and an optional "predicate."
  • Searching through an XML document begins at a
    context node in the XPath tree.
  • Searches through the XPath tree are made relative
    to this context node.
  • An axis indicates which nodes, relative to the
    context node, should be included in the search.
  • The axis also dictates the ordering of the nodes
    in the set.
  • Axes that select nodes that follow the context
    node in document order are called forward axes.
  • Axes that select nodes that precede the context
    node in document order are called reverse axes.

19
XPath Context
  • A step is evaluated in a specific context
    [<N1, N2, ..., Nn>, Nc] which consists of
  • a context list <N1, N2, ..., Nn> of nodes from
    the XML tree
  • a context node Nc belonging to the context list.
  • The context length n is a positive integer
    indicating the size of the context list of
    nodes; it can be obtained with the function
    last()
  • The context node position c ∈ [1, n] is a positive
    integer indicating the position of the context
    node in the context list of nodes; it can be
    obtained with the function position().

20
XPath Steps
  • The basic components of XPath expressions are
    steps, of the form axis::node-test[P1][P2]...[Pn]
  • axis is an axis name indicating the direction
    of the step in the XML tree (child is the
    default).
  • node-test is a node test, indicating the kind of
    nodes to select.
  • Pi is a predicate, that is, any XPath expression,
    evaluated as a boolean, indicating an additional
    condition. There may be no predicates at all.
  • A step is evaluated with respect to a context,
    and returns a node list.

21
Path Expressions
  • A path expression is of the form
    /step1/step2/. . . /stepn
  • A path that begins with / is an absolute path
    expression
  • A path that does not begin with / is a relative
    path expression.
  • Examples
  • /A/B is an absolute path expression denoting the
    Element nodes with name B, children of the root
    named A
  • ./B/descendant::text() is a relative path
    expression which denotes all the Text nodes
    descendant of an Element B, itself child of the
    context node
  • /A/B/@att1[. > 2] denotes all the Attribute nodes
    @att1 whose value is greater than 2.

22
Evaluation of Path Expressions
  • Each step_i is interpreted with respect to a
    context; its result is a node list.
  • A step step_i is evaluated with respect to the
    context of step_(i-1). More precisely
  • For i = 1 (first step): if the path is absolute,
    the context is a singleton, the root of the XML
    tree; else (relative paths) the context is
    defined by the environment
  • For i > 1: if N = <N1, N2, ..., Nn> is the
    result of step step_(i-1), step_i is successively
    evaluated with respect to the context [N, Nj],
    for each j ∈ [1, n].
  • The result of the path expression is the node set
    obtained after evaluating the last step.

23
Evaluation of Path Expressions
  • Evaluation of /A/B/@att1
  • The path expression is absolute: the context
    consists of the root node of the tree.
  • The first step, A, is evaluated with respect to
    this context.

24
Evaluation of /A/B/@att1
  • The result is A, the root element.
  • A is the context for the evaluation of the second
    step, B.

25
Evaluation of /A/B/@att1
  • The result is a node list with two nodes <B1,
    B2>.
  • @att1 is first evaluated with the context node
    B1.

26
Evaluation of /A/B/@att1
  • The result is the attribute node of B1.

27
Evaluation of /A/B/@att1
  • @att1 is also evaluated with the context node
    B2.

28
Evaluation of /A/B/@att1
  • The result is the attribute node of B2.

29
Evaluation of /A/B/@att1
  • Final result: the node set union of all the
    results of the last step, @att1.
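
The step-by-step evaluation of /A/B/@att1 can be mimicked in a few lines. This is a sketch under stated assumptions, not a full XPath engine: it handles only child-axis name tests plus an optional final `@name` step, the function `evaluate_absolute` and the sample document are made up for illustration, and attribute nodes are represented as `(element tag, attribute name, value)` tuples.

```python
import xml.etree.ElementTree as ET

def evaluate_absolute(root_element, steps):
    """Evaluate /step1/step2/... one step at a time.

    Every step is a child-axis name test; only the last step may be '@name'.
    """
    # Step 1: the context is the document root, whose only element child
    # is the root element.
    current = [root_element] if root_element.tag == steps[0] else []
    for step in steps[1:]:
        result = []
        for context_node in current:       # each Nj becomes the context node in turn
            if step.startswith("@"):
                value = context_node.get(step[1:])
                if value is not None:
                    result.append((context_node.tag, step[1:], value))
            else:
                result.extend(c for c in context_node if c.tag == step)
        current = result                    # union of the per-context results
    return current

doc = ET.fromstring('<A><B att1="1"><D/></B><B att1="2"/><C att1="3"/></A>')
attrs = evaluate_absolute(doc, ["A", "B", "@att1"])
print(attrs)  # [('B', 'att1', '1'), ('B', 'att1', '2')]
```

The attribute of C is not returned because C is never part of the context produced by the second step.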

30
XPath Axes
Axis | Ordering | Description
self | None | The context node itself.
parent | Reverse | The context node's parent, if one exists.
child | Forward | The context node's children, if they exist.
ancestor | Reverse | The context node's ancestors, if they exist.
ancestor-or-self | Reverse | The context node's ancestors and also itself.
descendant | Forward | The context node's descendants.
descendant-or-self | Forward | The context node's descendants and also itself.
following | Forward | The nodes in the XML document following the context node, not including descendants.
following-sibling | Forward | The sibling nodes following the context node.
preceding | Reverse | The nodes in the XML document preceding the context node, not including ancestors.
preceding-sibling | Reverse | The sibling nodes preceding the context node.
attribute | Forward | The attribute nodes of the context node.
namespace | Forward | The namespace nodes of the context node.
31
XPath Axes
  • An axis has a principal node type that
    corresponds to the type of node the axis may
    select.
  • For attribute axes, the principal node type is
    attribute.
  • For namespace axes, the principal node type is
    namespace.
  • All other axes have an element principal node
    type.

32
XPath Axes
  • Child axis: denotes the Element or Text children
    of the context node.
  • Important: An Attribute node has a parent (the
    element on which it is located), but an attribute
    node is not one of the children of its parent.
  • Example: child::D

33
XPath Axes
  • Parent axis: denotes the parent of the context
    node.
  • The node test is either an element name, or *
    which matches all names, or node() which matches
    all node types.
  • The result is always an Element or Document
    node, or an empty node-set (if the parent does
    not match the node test or does not satisfy a
    predicate).
  • .. is an abbreviation for parent::node(): the
    parent of the context node
  • Example: parent::node()

34
XPath Axes
  • Attribute axis: denotes the attributes of the
    context node.
  • The node test is either the attribute name, or *
    which matches all the names.
  • Example: attribute::*

35
XPath Axes
  • Descendant axis: all the descendant nodes, except
    the Attribute nodes.
  • The node test is either the node name (for
    Element nodes), or * (any Element node) or text()
    (any Text node) or node() (all nodes).
  • The context node does not belong to the result;
    use descendant-or-self instead.
  • Example: descendant::node()

36
XPath Axes
  • Example: descendant::*

37
XPath Axes
  • Ancestor axis: all the ancestor nodes.
  • The node test is either the node name (for
    Element nodes), or node() (any Element node, and
    the Document root node).
  • The context node does not belong to the result;
    use ancestor-or-self instead.
  • Example: ancestor::node()

38
XPath Axes
  • Following axis: all the nodes that follow the
    context node in document order.
  • Attribute nodes are not selected.
  • The node test is either the node name, *, text()
    or node().
  • The axis preceding denotes all the nodes that
    precede the context node.
  • Example: following::node()

39
XPath Axes
  • Following-sibling axis: all the nodes that
    follow the context node and share the same
    parent node.
  • Same node tests as descendant or following.
  • The axis preceding-sibling denotes all the
    siblings that precede the context node.
  • Example: following-sibling::node()
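
Several of the axes described above can be sketched over an element tree. The code below is illustrative only: it assumes Python's `xml.etree.ElementTree`, covers element nodes only (no text, attribute or document nodes), and uses a made-up sample document and hypothetical helper names. Reverse axes return the nearest node first.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<A><B><D/><E/></B><C><F/></C></A>")

# ElementTree has no parent pointers, so build a child -> parent map once.
parent = {child: p for p in doc.iter() for child in p}

def ancestors(node):
    """Reverse axis: parent first, then grandparent, and so on."""
    out = []
    while node in parent:
        node = parent[node]
        out.append(node)
    return out

def descendants(node):
    """Forward axis: document order, context node excluded."""
    return list(node.iter())[1:]

def following_siblings(node):
    """Forward axis: later children of the same parent."""
    p = parent.get(node)
    if p is None:
        return []
    siblings = list(p)
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Reverse axis: earlier children of the same parent, nearest first."""
    p = parent.get(node)
    if p is None:
        return []
    siblings = list(p)
    return siblings[:siblings.index(node)][::-1]

d, e, b = doc.find("B/D"), doc.find("B/E"), doc.find("B")
print([n.tag for n in ancestors(d)])           # ['B', 'A']
print([n.tag for n in descendants(doc)])       # ['B', 'D', 'E', 'C', 'F']
print([n.tag for n in following_siblings(b)])  # ['C']
print([n.tag for n in preceding_siblings(e)])  # ['D']
```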

40
Location Path Abbreviations
Location path | Abbreviation
child:: | Used by default if no axis is supplied, and may therefore be omitted
attribute:: | @
/descendant-or-self::node()/ | //
self::node() | .
parent::node() | ..
41
XPath Node Tests
  • The set of selected nodes is refined with node
    tests.
  • node tests rely upon the principal node type of
    an axis for selecting nodes in a location path

Node test | Description
* | Selects all nodes of the same principal node type.
node() | Selects all nodes, regardless of their type.
text() | Selects all text nodes.
comment() | Selects all comment nodes.
processing-instruction() | Selects all processing-instruction nodes.
node name | Selects all nodes with the specified node name.
42
XPath Axes
  • Location Paths Using Axes and Node Tests
  • Location paths are composed of sequences of
    location steps.
  • A location step contains an axis and a node test
    separated by a double-colon (::) and, optionally,
    a "predicate" enclosed in square brackets ([ ]).
  • child::*
  • The above location path selects all element-node
    children of the context node, because the
    principal node type for the child axis is
    element.

43
XPath Wildcard
  • //author/child::* or //author/*
  • Result: <first-name> Rick </first-name>
  •         <last-name> Hull </last-name>
  • * matches any element

44
XPath Axes and Node Tests
  • child::text()
  • selects all text-node children of the context
    node
  • Combining two location steps to form the location
    path
  • child::*/child::text()
  • selects all text-node grandchildren of the
    context node

45
XPath Node Tests
  • /bib/book/author/text()
  • Result: Serge Abiteboul
  •         Victor Vianu
  •         Jeffrey D. Ullman
  • Rick Hull doesn't appear because his name is
    held in first-name and last-name child elements
  • /bib/book/author/*/text()
  • Result: Rick
  •         Hull

46
XPath Restricted Kleene Closure
  • select all author element nodes in an entire
    document
  • /descendant-or-self::node()/child::author
  • Instead use the abbreviation
  • //author
  • Result: <author> Serge Abiteboul </author>
  •         <author> <first-name> Rick </first-name>
  •                  <last-name> Hull </last-name>
  •         </author>
  •         <author> Victor Vianu </author>
  •         <author> Jeffrey D. Ullman </author>
  • /bib//first-name
  • Result: <first-name> Rick </first-name>
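
The `//` abbreviation can be tried with Python's standard library. A minimal sketch, assuming `xml.etree.ElementTree`, where `.//author` (relative to the root element) plays the role of `//author`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

authors = bib.findall(".//author")                            # stands in for //author
first_names = [e.text for e in bib.findall(".//first-name")]  # stands in for /bib//first-name

print(len(authors))   # 4
print(first_names)    # ['Rick']
```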

47
XPath Attribute Nodes
  • /bib/book/@price
  • Result: 55
  • @price means that price has to be an attribute

48
XPath Predicates
  • Boolean expression, built with tests and the
    Boolean connectors and/or (negation is expressed
    with the not() function)
  • a test is
  • either an XPath expression, whose result is
    converted to a Boolean
  • or a comparison, or a call to a Boolean function.
  • Important predicate evaluation requires several
    rules for converting nodes and node sets to the
    appropriate type.

49
Predicate Evaluation
  • A step is of the form axis::node-test[P]
  • First, axis::node-test is evaluated: one obtains
    an intermediate result I
  • Second, for each node in I, P is evaluated: the
    step result consists of those nodes in I for
    which P is true.
  • /A/B/descendant::text()[1]

50
Predicate Evaluation
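  • Beware: an XPath step is always evaluated with
    respect to the context of the previous step.
  • For /A/B/descendant::text()[1], the result
    consists of those Text nodes that are the first
    descendant (in document order) of a node B.
  • /A/B//text()[1] is different: it expands to
    /A/B/descendant-or-self::node()/child::text()[1]
    and selects every Text node that is the first
    Text child of its parent.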

51
XPath Predicates
  • <?xml version = "1.0"?>
  • <!-- Fig. 11.9 : books.xml -->
  • <!-- XML book list -->
  • <books>
  • <book>
  • <title>Java How to Program</title>
  • <translation edition = "1">Spanish</translation>
    <translation edition = "1">Chinese</translation>
    <translation edition = "1">Japanese</translation>
    <translation edition = "2">French</translation>
    <translation edition = "2">Japanese</translation>
  • </book>
  • <book>
  • <title>C How to Program</title>
  • <translation edition = "1">Korean</translation>
    <translation edition = "2">French</translation>
    <translation edition = "2">Spanish</translation>
    <translation edition = "3">Italian</translation>
    <translation edition = "3">Japanese</translation>
  • </book>
  • </books>

52
XPath Predicates
  • Select the title element node for each book that
    has a Japanese translation
  • /books/book/translation[. = 'Japanese']/../title
  • A predicate is a Boolean expression used as part
    of a location path to filter nodes from the
    search.
  • Select the edition attribute node for books with
    Japanese translations
  • /books/book/translation[. = 'Japanese']/@edition
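
The two predicate queries on books.xml can be approximated with Python's `xml.etree.ElementTree`. This is a sketch under assumptions: the `[.='text']` predicate needs Python 3.7 or later, paths are relative to the root element `books`, and since ElementTree cannot select attribute nodes the `@edition` step is replaced by a call to `get()`.

```python
import xml.etree.ElementTree as ET

BOOKS = """<books>
  <book>
    <title>Java How to Program</title>
    <translation edition="1">Spanish</translation>
    <translation edition="1">Chinese</translation>
    <translation edition="1">Japanese</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Japanese</translation>
  </book>
  <book>
    <title>C How to Program</title>
    <translation edition="1">Korean</translation>
    <translation edition="2">French</translation>
    <translation edition="2">Spanish</translation>
    <translation edition="3">Italian</translation>
    <translation edition="3">Japanese</translation>
  </book>
</books>"""

books = ET.fromstring(BOOKS)

# /books/book/translation[. = 'Japanese']/../title
titles = [t.text for t in
          books.findall("./book/translation[.='Japanese']/../title")]

# /books/book/translation[. = 'Japanese']/@edition
editions = [t.get("edition") for t in
            books.findall("./book/translation[.='Japanese']")]

print(titles)    # ['Java How to Program', 'C How to Program']
print(editions)  # ['1', '2', '3']
```

The first book has two Japanese translations, yet its title appears once: the `..` step yields each parent a single time, as a node-set would.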

53
XPath 1.0 Type System
  • Four primitive types
  • The boolean(), number(), string() functions
    convert types into each other (no conversion to
    nodesets is defined), but this conversion is done
    in an implicit way most of the time.
  • Rules for converting to a Boolean
  • A number is true if it is neither 0 nor NaN.
  • A string is true if its length is not 0.
  • A nodeset is true if it is not empty.

Type | Description | Literals | Examples
Boolean | Boolean values | None | true(), not($a=3)
Number | Floating-point | 12, 12.5 | 1 div 33
String | Character strings | "to", 'ti' | concat('Hello','!')
Nodeset | Node set | None | /a/b[c=1 or @e]/d
54
XPath 1.0 Type System
  • Rules for converting a nodeset to a string
  • The string value of a nodeset is the string value
    of its first item in document order.
  • The string value of an element or document node
    is the concatenation of the character data in all
    text nodes below.
  • The string value of a text node is its character
    data.
  • The string value of an attribute node is the
    attribute value.
  • Examples (whitespace-only text nodes removed)

<a toto="3"> <b titi='tutu'><c /></b> <d>tata</d> </a>
string(/) = "tata"
string(/a/@toto) = "3"
boolean(/a/b) = true()
boolean(/a/e) = false()
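
The conversion rules and the four examples above can be checked with a small sketch. It assumes Python's `xml.etree.ElementTree`, models a node set as a Python list of elements, and writes the document without inter-element whitespace to match the "whitespace-only text nodes removed" convention. `xpath_boolean` and `string_value` are hypothetical helper names.

```python
import math
import xml.etree.ElementTree as ET

def xpath_boolean(value):
    """Apply the XPath 1.0 rules for converting a value to a Boolean."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)   # neither 0 nor NaN
    if isinstance(value, str):
        return len(value) > 0                          # non-empty string
    return len(value) > 0                              # non-empty node set (list)

def string_value(element):
    """Concatenate the character data of all text nodes below the element."""
    return "".join(element.itertext())

a = ET.fromstring('<a toto="3"><b titi="tutu"><c/></b><d>tata</d></a>')

print(string_value(a))                    # tata   -> string(/)
print(a.get("toto"))                      # 3      -> string(/a/@toto)
print(xpath_boolean(a.findall("./b")))    # True   -> boolean(/a/b)
print(xpath_boolean(a.findall("./e")))    # False  -> boolean(/a/e)
```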
55
Operators
  • Node-set operators manipulate node sets to form
    other node sets.

Operators | Description
pipe (|) | union of node-sets (Example: node()|@*)
slash (/) | Separates location steps
double-slash (//) | Abbreviation for the location path /descendant-or-self::node()/
+, -, *, div, mod | standard arithmetic operators
or, and | Boolean operators (Example: @a and c=3)
<, <=, >, >= | relational operators (Example: ($a<2) and ($a>0))
56
Node-set Functions
  • node-set functions perform an action on a
    node-set returned by a location path

Node-set function | Description
last() | Returns a number equal to the context size from the expression evaluation context.
position() | Returns the position number of the current node in the node-set being tested.
count( node-set ) | Returns the number of nodes in node-set.
id( string ) | Returns the element node whose ID attribute matches the value specified by argument string.
local-name( node-set ) | Returns the local part of the expanded-name for the first node in node-set.
namespace-uri( node-set ) | Returns the namespace URI of the expanded-name for the first node in node-set.
name( node-set ) | Returns the qualified name for the first node in node-set.
57
Node-set Functions
  • //book/author[last()]
  • Returns the last author child of each book node
    (in the bib example: Victor Vianu and
    Jeffrey D. Ullman)
  • //book/author[position() = 3] or
    //book/author[3]
  • Selects the third author element of the book node
  • count(/book/*)
  • returns the total number of element-node children
    of the book node
  • //book
  • selects all book element nodes in the document
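
Positional predicates can be tried on the bib example. A minimal sketch, assuming Python's `xml.etree.ElementTree`, which supports `[last()]` and `[n]` but not the `count()` function, so the child count is taken with `len()` instead; variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

# //book/author[last()] : the predicate is evaluated once per book context.
last_authors = [a.text for a in bib.findall(".//book/author[last()]")]

# //book/author[3] : only the first book has a third author.
third_authors = [a.text for a in bib.findall(".//book/author[3]")]

# count(*) with each book as context node: number of element children.
child_counts = [len(book) for book in bib.findall(".//book")]

print(last_authors)   # ['Victor Vianu', 'Jeffrey D. Ullman']
print(third_authors)  # ['Victor Vianu']
print(child_counts)   # [6, 4]
```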

58
String Functions
String function | Description
concat(s1,...,sn) | concatenates the strings s1, ..., sn
starts-with(a,b) | returns true() if the string a starts with b
contains(a,b) | returns true() if the string a contains b
substring-before(a,b) | returns the substring of a before the first occurrence of b
substring-after(a,b) | returns the substring of a after the first occurrence of b
substring(a,n,l) | returns the substring of a of length l starting at index n (indexes start from 1); l may be omitted
string-length(a) | returns the length of the string a
normalize-space(a) | removes all leading and trailing whitespace from a, and collapses each run of whitespace to a single space
translate(a,b,c) | returns the string a, where every occurrence of a character from b has been replaced by the character at the same position in c
59
Boolean and Number Functions
Function | Description
not(b) | returns the logical negation of the boolean b
sum(s) | returns the sum of the values of the nodes in the nodeset s
floor(n) | rounds the number n to the next lowest integer
ceiling(n) | rounds the number n to the next greatest integer
round(n) | rounds the number n to the closest integer

Examples:
count(//*) returns the number of elements in the document
normalize-space(' titi  toto ') returns the string 'titi toto'
translate('baba','abcdef','ABCDEF') returns the string 'BABA'
round(3.457) returns the number 3
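
The behaviour of a few of these functions can be mirrored in plain Python. The functions below are illustrative re-implementations written for this sketch (they are not a library API), and they ignore edge cases such as NaN and infinities in `xpath_round`.

```python
import math
import re

def normalize_space(s):
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def translate(s, src, dst):
    """Replace each character of src by the character at the same position
    in dst; characters of src with no counterpart in dst are removed."""
    mapping = {}
    for i, ch in enumerate(src):
        if ch not in mapping:                 # the first occurrence in src wins
            mapping[ch] = dst[i] if i < len(dst) else ""
    return "".join(mapping.get(ch, ch) for ch in s)

def substring_before(a, b):
    i = a.find(b)
    return a[:i] if i >= 0 else ""

def substring_after(a, b):
    i = a.find(b)
    return a[i + len(b):] if i >= 0 else ""

def xpath_round(n):
    """Round to the closest integer; ties go towards positive infinity."""
    return math.floor(n + 0.5)

print(normalize_space(" titi  toto "))          # titi toto
print(translate("baba", "abcdef", "ABCDEF"))    # BABA
print(xpath_round(3.457))                       # 3
print(substring_before("1999/04/01", "/"))      # 1999
print(substring_after("1999/04/01", "/"))       # 04/01
```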
60
XPath String functions
  • <?xml version = "1.0"?>
  • <!-- Fig. 11.14 : stocks.xsl -->
  • <!-- string function usage -->
  • <xsl:stylesheet version = "1.0"
  •    xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
  • <xsl:template match = "/stocks">
    <html> <body> <ul>
    <xsl:for-each select = "stock">
    <xsl:if test = "starts-with(@symbol, 'C')">
    <li>
    <xsl:value-of
    select = "concat(@symbol, ' - ', name)"/>
    </li>
    </xsl:if> </xsl:for-each>
    </ul> </body>
    </html> </xsl:template>
  • </xsl:stylesheet>
  • <?xml version = "1.0"?>
  • <!-- Fig. 11.13 : stocks.xml -->
  • <!-- Stock list -->
  • <stocks>
  • <stock symbol = "INTC">
    <name>Intel Corporation</name> </stock>
    <stock symbol = "CSCO">
    <name>Cisco Systems, Inc.</name> </stock>
    <stock symbol = "DELL">
    <name>Dell Computer Corporation</name> </stock>
    <stock symbol = "MSFT">
    <name>Microsoft Corporation</name> </stock>
    <stock symbol = "SUNW">
    <name>Sun Microsystems, Inc.</name> </stock>
    <stock symbol = "CMGI">
    <name>CMGI, Inc.</name> </stock>
  • </stocks>
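
What the stylesheet computes can be reproduced without an XSLT processor. The sketch below assumes Python's `xml.etree.ElementTree` and uses `str.startswith` and string formatting as stand-ins for `starts-with()` and `concat()`; it builds the list items as plain strings rather than HTML.

```python
import xml.etree.ElementTree as ET

STOCKS = """<stocks>
  <stock symbol="INTC"><name>Intel Corporation</name></stock>
  <stock symbol="CSCO"><name>Cisco Systems, Inc.</name></stock>
  <stock symbol="DELL"><name>Dell Computer Corporation</name></stock>
  <stock symbol="MSFT"><name>Microsoft Corporation</name></stock>
  <stock symbol="SUNW"><name>Sun Microsystems, Inc.</name></stock>
  <stock symbol="CMGI"><name>CMGI, Inc.</name></stock>
</stocks>"""

stocks = ET.fromstring(STOCKS)

items = []
for stock in stocks.findall("stock"):                 # xsl:for-each select="stock"
    symbol = stock.get("symbol")
    if symbol.startswith("C"):                        # starts-with(@symbol, 'C')
        name = stock.findtext("name")
        items.append(f"{symbol} - {name}")            # concat(@symbol, ' - ', name)

print(items)  # ['CSCO - Cisco Systems, Inc.', 'CMGI - CMGI, Inc.']
```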

61
XPath Qualifiers
  • /bib/book/author[first-name]
  • Result: <author> <first-name> Rick </first-name>
    <last-name> Hull </last-name> </author>
  • /bib/book[@price < 60]
  • /bib/book/author[@age < 25]
  • /bib/book/author[text()]
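
These qualifiers can be approximated on the bib example. A sketch under assumptions: Python's `xml.etree.ElementTree` supports `[tag]` and `[@attr]` predicates but no numeric comparisons and no `text()` test, so the `< 60` comparison and the text-child check are done in Python. The document is written without whitespace between elements, so the Rick Hull author has no text-node children.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/author[first-name]: authors that have a first-name child.
with_first_name = bib.findall("./book/author[first-name]")

# /bib/book[@price < 60]: books without a price attribute never qualify.
cheap_titles = [b.findtext("title") for b in bib.findall("./book[@price]")
                if float(b.get("price")) < 60]

def has_text_child(element):
    """True if the element has at least one text-node child."""
    return bool(element.text) or any(child.tail for child in element)

# /bib/book/author[text()]: authors with at least one text-node child.
with_text = [a.text for a in bib.findall("./book/author") if has_text_child(a)]

print(len(with_first_name))  # 1
print(cheap_titles)          # ['Principles of Database and Knowledge Base Systems']
print(with_text)             # ['Serge Abiteboul', 'Victor Vianu', 'Jeffrey D. Ullman']
```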

62
XPath Examples
  • child::A/descendant::B: B elements, descendant
    of an A element, itself child of the context
    node. Can be abbreviated to A//B.
  • child::*/child::B: all the B grand-children of
    the context node
  • descendant-or-self::B: elements B descendants of
    the context node, plus the context node itself if
    its name is B.
  • child::B[position()=last()]: the last child
    named B of the context node. Abbreviated to
    B[last()].
  • following-sibling::B[1]: the first sibling of
    type B (in the document order) of the context node

63
XPath Examples
  • /descendant::B[10]: the tenth element of type B in
    the document.
  • Not the tenth element of the document, if its
    type is B!
  • child::B[child::C]: child elements B that have a
    child element C. Abbreviated to B[C].
  • /descendant::B[@att1 or @att2]: elements B that
    have an attribute att1 or an attribute att2.
    Abbreviated to //B[@att1 or @att2]
  • *[self::B or self::C]: child elements named B
    or C

64
XPath Summary
  • bib matches a bib element
  • * matches any element
  • / matches the root
  • /bib matches a bib element under root
  • bib/paper matches a paper in bib
  • bib//paper matches a paper in bib, at any depth
  • //paper matches a paper at any depth
  • paper|book matches a paper or a book
  • @price matches a price attribute
  • bib/book/@price matches price attribute in book,
    in bib
  • bib/book[@price<55]/author/lastname matches ...

65
The Root and the Root
  • <bib> <paper> 1 </paper> <paper> 2 </paper>
    </bib>
  • bib is the document element
  • The root is above bib
  • /bib returns the document element
  • / returns the root
  • Why? Because we may have comments before and
    after <bib>; they become siblings of <bib>

66
XPath More Details
  • Examples
  • child::author/child::lastname = author/lastname
  • child::author/descendant::zip = author//zip
  • child::author/parent::* = author/..
  • child::author/attribute::age = author/@age
  • What does this mean?
  • paper/publisher/parent::*/author
  • /bib//address[ancestor::book]
  • /bib//author/ancestor::*//zip

67
XPath Even More Details
  • name() = the name of the current node
  • /bib//*[name()='book'] same as /bib//book
  • What does this mean?
    /bib//*[ancestor::*[name()!='book']]
  • In a different notation: bib.[^book]*._
  • Navigation axis gives us strictly more power !

68
XPath 2.0
  • An extension of XPath 1.0, backward compatible
    with XPath 1.0. Main differences
  • Improved data model tightly associated with XML
    Schema.
  • A new sequence type, representing ordered sets
    of nodes and/or values, with duplicates allowed.
  • XSD types can be used for node tests.
  • More powerful new operators (loops) and better
    control of the output (limited tree restructuring
    capabilities)
  • Extensible Many new built-in functions
    possibility to add user-defined functions.
  • XPath 2.0 is also a subset of XQuery 1.0.

69
Path expressions in XPath 2.0
  • New node tests in XPath 2.0
  • Nested paths expressions
  • Any expression that returns a sequence of nodes
    can be used as a step

Node test | Description
item() | any node or atomic value
element() | any element (eq. to child::* in XPath 1.0)
element(author) | any element named author
element(*, xs:person) | any element of type xs:person
attribute() | any attribute

Example of a nested path expression: /book/(author | editor)/name
70
XPath 1.0 Implementations
  • libxml2: Free C library for parsing XML documents,
    supporting XPath.
  • javax.xml.xpath: Java package, included with JDK
    versions starting from 1.5.
  • System.Xml.XPath: .NET classes for XPath.
  • XML::XPath: Free Perl module, includes a
    command-line tool.
  • DOMXPath: PHP class for XPath, included in PHP5.
  • PyXML: Free Python library for parsing XML
    documents, supporting XPath.