1
Creating and Reading XML documents using .NET
  • Marios Tziakouris
  • University of Cyprus
  • EPL602
  • Fall 2004

2
XML Documents Can be Read Programmatically
  • The .NET Framework consists of many classes to
    aid in programmatically iterating through and
    navigating XML documents.
  • These classes are found in the System.Xml
    namespace.

3
Accessing XML Content
  • XML documents can be accessed in one of two ways:
    in a push model or a pull model.
  • The pull model loads the entire XML document into
    memory, and then works with the document once it
    has been completely loaded.
  • The push model accesses only tiny pieces of the
    XML document when needed.

4
Comparing and Contrasting Push and Pull Approaches
5
How to use the Two Methods
  • The .NET Framework provides developers with both
    methods:
  • Pull method: use the DOM classes in the .NET
    Framework.
  • Push method: use the XmlReader and XmlWriter
    classes.

6
Using the Pull Method
  • The System.Xml namespace contains a number of
    classes to work with XML documents in the DOM
    paradigm:
  • XmlDocument: represents an XML document.
  • XmlElement: represents an individual element in
    the DOM.
  • XmlAttribute: represents an attribute.
  • XmlText: represents text content.

7
Using the Push Method
  • The XmlReader reads one node at a time from a
    specified XML source. The XmlReader can only
    read in a FORWARD direction.
  • The XmlReader class cannot be used directly;
    instead, one of its derived classes must be used:
  • XmlNodeReader: reads one node at a time from an
    XML DOM.
  • XmlTextReader: reads one node at a time from an
    XML source, such as a file with XML content.
  • XmlValidatingReader: a reader that performs DTD
    or schema validation.

8
Iterating through an XML Document using
XmlTextReader
  • To iterate through the contents of an XML
    document with the XmlTextReader, we need to:
  • Specify the XML document to iterate through when
    creating the XmlTextReader.
  • Call the Read() method, which reads in the next
    node.
  • Access the properties of the XmlTextReader to
    determine the name, value, and other information
    about the node just read.

9
Iterating through an XML Document using
XmlTextReader
  • We can programmatically read through the contents
    of an XML file like so:

// create an XmlTextReader to read the specified XML file
XmlTextReader reader = new XmlTextReader(filepath);

// now, display the information of each node in the TextBox
while (reader.Read())
{
    // access the properties of the XmlTextReader class...
    // like reader.Name, reader.NodeType, reader.Value, etc.
}

// close the XmlTextReader
reader.Close();
10
What is a Node?
  • Recall that the XmlReader classes read XML nodes.
    What constitutes a node? Can you identify the
    nodes in the following XML fragment?

<?xml version="1.0" encoding="utf-8" ?>
<books>
  <book price="34.95">
    <title>Animal Farm</title>
    <authors>
      <author>Orwell</author>
    </authors>
  </book>
</books>
11
What is a Node?
<?xml version="1.0" encoding="utf-8" ?>
<books>
  <book price="34.95">
    <title>Animal Farm</title>
    <authors>
      <author>Orwell</author>
    </authors>
  </book>
</books>

The whitespace between elements (if present)
is also considered a node! (You can, however, set
the XmlTextReader's WhitespaceHandling property
to specify whether the reader should read whitespace
nodes.)
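A minimal sketch of the effect of WhitespaceHandling. The NodeLister class, its ListNodes helper, and the two-line sample string are hypothetical, and the XML is read from a string rather than a file so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeLister
{
    // Returns one "NodeType:Name" entry per node that Read() stops on.
    public static List<string> ListNodes(string xml, WhitespaceHandling handling)
    {
        List<string> nodes = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = handling;
        while (reader.Read())
        {
            nodes.Add(reader.NodeType + ":" + reader.Name);
        }
        reader.Close();
        return nodes;
    }

    public static void Main()
    {
        // Hypothetical sample: the line breaks and indentation are whitespace nodes.
        string xml = "<books>\n  <book price=\"34.95\" />\n</books>";
        Console.WriteLine(ListNodes(xml, WhitespaceHandling.All).Count);
        Console.WriteLine(ListNodes(xml, WhitespaceHandling.None).Count);
    }
}
```

With WhitespaceHandling.All the reader also stops on the whitespace between the tags, so the first count should be larger than the second. The price attribute does not appear as an entry in either list.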
12
What is a Node?
<?xml version="1.0" encoding="utf-8" ?>
<books>
  <book price="34.95">
    <title>Animal Farm</title>
    <authors>
      <author>Orwell</author>
    </authors>
  </book>
</books>

Notice that the attributes of an element are not
considered nodes...
13
Creating a Program to View the Content Read by an
XmlTextReader
  • We can create a program that allows the user to
    select an XML file; the contents of the XML
    file are then read by an XmlTextReader, with each
    node's name, type, and value displayed. (Run
    demo!)

14
Reading the Attributes
  • As we saw in the demo, the attributes are not
    read as a separate node.
  • We can determine whether or not a given node has
    attributes by the HasAttributes property.
  • In order to programmatically access the
    attributes of a node, we must use the
    MoveToNextAttribute() method of the XmlTextReader.

15
Reading the Attributes
// C#
while (reader.Read())
{
    if (reader.HasAttributes)
    {
        while (reader.MoveToNextAttribute())
        {
            // Access the attribute name/value via
            // reader.Name/reader.Value
        }
    }
}

' VB.NET
While reader.Read()
    If reader.HasAttributes Then
        While reader.MoveToNextAttribute()
            ' Access the attribute name/value via
            ' reader.Name/reader.Value
        End While
    End If
End While
16
The XmlTextReader Properties and Methods
  • The properties and methods of the XmlTextReader
    can be found in Visual Studio .NET or in MSDN.
  • Some other methods include:
  • ReadInnerXml(): returns a string with the
    complete content (including XML markup) of the
    current node (child nodes, text content, etc.).
  • ReadOuterXml(): returns a string containing the
    node's own XML markup along with the markup of
    its content.
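A short sketch contrasting the two methods. The InnerOuterDemo class and its ReadMarkup helper are hypothetical names, and the sample XML is an inline string.

```csharp
using System;
using System.IO;
using System.Xml;

public static class InnerOuterDemo
{
    // Advances to the first element named elementName and returns
    // either its inner or its outer XML.
    public static string ReadMarkup(string xml, string elementName, bool outer)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    return outer ? reader.ReadOuterXml() : reader.ReadInnerXml();
                }
            }
            return null; // element not found
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        string xml = "<book price=\"34.95\"><title>Animal Farm</title></book>";
        Console.WriteLine(ReadMarkup(xml, "book", false)); // content of <book> only
        Console.WriteLine(ReadMarkup(xml, "book", true));  // <book> tag plus content
    }
}
```

Both calls also advance the reader past the element, so they are best used when the whole subtree is wanted as a string.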

17
The XmlTextReader Properties and Methods
  • When reading an XML document, the XmlTextReader
    class will throw an XmlException if there is an
    error in parsing the XML.
  • An error can occur, for example, if the XML is
    malformed (that is, not well-formed).
  • Run the XmlException demo.
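In place of the demo, a minimal sketch of catching the exception. The WellFormedCheck class and FindParseError helper are hypothetical; the inputs are inline strings rather than files.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Returns null if the XML parses cleanly, otherwise a description of the error.
    public static string FindParseError(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read()) { } // read every node; only errors matter here
            return null;
        }
        catch (XmlException ex)
        {
            return "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        Console.WriteLine(FindParseError("<books><book></book></books>") ?? "well-formed");
        // The <book> element is never closed, so this input is malformed.
        Console.WriteLine(FindParseError("<books><book></books>") ?? "well-formed");
    }
}
```

Because the reader only parses as far as Read() has been called, the error surfaces when the loop reaches the bad markup, not when the reader is constructed.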

18
Using the DOM to Iterate through an XML Document
  • In contrast to the Push method (XmlReader/XmlWriter),
    the .NET Framework offers a Pull method.
  • Recall that the Pull method reads the entire XML
    document into memory and then works with it from
    there.
  • For this model, XML documents are represented in
    the Document Object Model (DOM).

19
What is the DOM?
  • DOM stands for Document Object Model, and it's a
    model that can be used to describe an XML
    document.
  • The DOM expresses the XML document as a hierarchy
    of nodes, where each element can have zero to
    many child elements.
  • The text content and attributes of an element are
    expressed as its children as well.

20
Example XML File
<?xml version="1.0" encoding="UTF-8" ?>
<books>
  <book price="34.95">
    <title>TYASP 3.0</title>
    <authors>
      <author>Mitchell</author>
    </authors>
  </book>
  <book price="29.95">
    <title>ASP.NET Tips</title>
    <authors>
      <author>Mitchell</author>
      <author>Walther</author>
      <author>Seven</author>
    </authors>
  </book>
</books>
21
The DOM View of the XML Document
22
The DOM Classes - XmlNode
  • There are a number of classes in the System.Xml
    namespace that represent the DOM.
  • Each box in the DOM model is represented in the
    .NET Framework by the XmlNode class.
  • This means that elements, attributes, and text
    values are all represented by the XmlNode class.

23
Extending the XmlNode Class
  • There are a number of classes that are derived
    from the XmlNode class
  • XmlAttribute
  • XmlElement
  • XmlDocument
  • And so on

24
The XmlNode Properties
  • The XmlNode class includes many properties, the
    most important ones being:
  • Name: the name of the node. For elements and
    attributes, the name is the name of the element
    or attribute. For text content, the name is
    #text.
  • Value: the value of the node. For
    elements, there is no value. For attributes,
    it's the value of the attribute; for text nodes,
    it's the value of the text in the node.
  • NodeType: indicates the type of the node
    (element, text, attribute, etc.)
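A small sketch that prints these three properties for an element, an attribute, and a text node. The NodePropertiesDemo class and Describe helper are hypothetical; LoadXml() is used here (rather than Load()) only so the sample XML can be an inline string.

```csharp
using System;
using System.Xml;

public static class NodePropertiesDemo
{
    // Formats NodeType, Name, and Value; Value is null for element nodes.
    public static string Describe(XmlNode node)
    {
        return node.NodeType + " | " + node.Name + " | " + (node.Value ?? "(null)");
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book price=\"34.95\"><title>Animal Farm</title></book>");

        XmlElement book = doc.DocumentElement;
        Console.WriteLine(Describe(book));                       // the <book> element
        Console.WriteLine(Describe(book.Attributes["price"]));   // its price attribute
        Console.WriteLine(Describe(book.FirstChild.FirstChild)); // the text inside <title>
    }
}
```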

25
More XmlNode Properties
  • InnerXml: the string content of the XML markup
    of the node's children.
  • OuterXml: the string content of the XML markup
    of the node itself and its children.
  • InnerText: the string content of the value of
    the node and all its child nodes.
  • HasChildNodes: a Boolean, indicating if the node
    has any children.
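A brief sketch of these four properties on one element. The MarkupPropertiesDemo class and LoadAuthors helper are hypothetical, and the XML is loaded from an inline string with LoadXml().

```csharp
using System;
using System.Xml;

public static class MarkupPropertiesDemo
{
    public static XmlElement LoadAuthors()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<authors><author>Mitchell</author><author>Walther</author></authors>");
        return doc.DocumentElement;
    }

    public static void Main()
    {
        XmlElement authors = LoadAuthors();
        Console.WriteLine(authors.InnerXml);      // markup of the children only
        Console.WriteLine(authors.OuterXml);      // markup of <authors> plus its children
        Console.WriteLine(authors.InnerText);     // concatenated text of all descendants
        Console.WriteLine(authors.HasChildNodes); // whether <authors> has any children
    }
}
```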

26
The XmlNodeList Class
  • The XmlNodeList class represents an arbitrary
    collection of XmlNodes.
  • For example, the XmlNode class has a ChildNodes
    property, which returns an XmlNodeList instance.
    This instance is a collection of nodes
    representing the DOM element's children.

27
Loading an XML Document into a DOM Representation
  • The XmlDocument's Load() method has four
    variations:
  • Load(Stream)
  • Load(string)
  • Load(TextReader)
  • Load(XmlReader)
  • In the Load(string) variation, the input string
    is a file path (or URL) to the XML file to load
    into the DOM representation.

28
The XmlDocument Properties
  • The XmlDocument is derived from the XmlNode
    class, meaning it has all of the properties and
    methods available to the XmlNode class.
  • Once an XML file has been loaded into an
    XmlDocument instance, we can access the root
    element through the DocumentElement property.

29
The XmlElement and XmlAttribute Classes
  • The XmlElement and XmlAttribute classes are also
    derived from the XmlNode class.
  • They represent, respectively, an element and an
    attribute.

30
Example
  • The following loads an XML document and displays
    the name of the root element.

Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim rootElementName As String
rootElementName = xmlDoc.DocumentElement.Name
31
Example
  • Iterating through the root element's children:

Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In xmlDoc.DocumentElement.ChildNodes
    ' Display the name of the node using n.Name
Next
32
An Example of Iterating through an XML Document
  • Let's create an application that displays an XML
    document in a TreeView control.
  • Each node in the TreeView represents a node in
    the DOM.

33
An Example of Iterating through an XML Document
  • We can recursively iterate through the DOM,
    ensuring that we'll visit each node.
  • View the application code...
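The application code is not part of this transcript. As a stand-in, here is a minimal sketch of the recursive walk that writes an indented text outline instead of filling a TreeView; the DomWalker class, the Walk method, and the sample XML are all hypothetical.

```csharp
using System;
using System.Text;
using System.Xml;

public static class DomWalker
{
    // Appends one indented line per node, then recurses into its children.
    // In System.Xml, attributes are exposed through node.Attributes rather
    // than ChildNodes, so this walk does not print them.
    public static void Walk(XmlNode node, int depth, StringBuilder output)
    {
        output.Append(new string(' ', depth * 2));
        output.Append(node.NodeType).Append(": ");
        output.AppendLine(node.NodeType == XmlNodeType.Text ? node.Value : node.Name);

        foreach (XmlNode child in node.ChildNodes)
        {
            Walk(child, depth + 1, output);
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<books><book price=\"34.95\"><title>Animal Farm</title></book></books>");

        StringBuilder output = new StringBuilder();
        Walk(doc.DocumentElement, 0, output);
        Console.Write(output.ToString());
    }
}
```

A TreeView version would follow the same shape: create a tree node for the current XmlNode, then recurse into ChildNodes and attach each result beneath it.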

34
Navigating through an XML Document
  • So far, all we have seen is how to iterate
    through an XML document, one node at a time.
  • With the pull method (DOM), however, we can
    navigate through the document as well.
  • For example, we might want to access just the
    elements in the document that have a certain
    name (such as elements with the name <author>).

35
Accessing Elements with a Certain Name
  • The XmlDocument class contains a
    GetElementsByTagName() method, which returns an
    XmlNodeList containing elements that have the
    specified tag name.

Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In xmlDoc.GetElementsByTagName("author")
    ' Display n.InnerText (n.Value is Nothing for element nodes)
Next
36
Navigating through an XML Document
  • However, what if we want to access nodes based on
    more complex criteria, such as "access all
    <book> elements with a price attribute value less
    than 30," or "access the names of the authors who
    have written more than one book"?
  • To accomplish this we need something more
    powerful: enter XPath!

37
A Quick Examination of XPath
  • XPath is used to address particular parts of an
    XML document.
  • XPath is named XPath because its syntax is
    similar to the syntax for a file path. For
    example, in our books XML document, we could use
    the following XPath statement to access all of
    the author elements:
  • /books/book/authors/author

38
Navigating through the DOM using XPath
  • The XmlNode class contains two methods for
    navigating the DOM:
  • SelectSingleNode(string)
  • SelectNodes(string)
  • The string input parameter for both of these
    methods is an XPath expression.
  • SelectSingleNode() returns at most one node, the
    first node to match the XPath expression.
  • SelectNodes() returns all of the nodes that match
    the XPath expression.
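A minimal sketch comparing the two methods on the same XPath expression. The XPathSelectDemo class and LoadBooks helper are hypothetical, and the document is a shortened inline version of the books example.

```csharp
using System;
using System.Xml;

public static class XPathSelectDemo
{
    public static XmlDocument LoadBooks()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<books>" +
            "<book price=\"34.95\"><title>TYASP 3.0</title>" +
            "<authors><author>Mitchell</author></authors></book>" +
            "<book price=\"29.95\"><title>ASP.NET Tips</title>" +
            "<authors><author>Mitchell</author><author>Walther</author></authors></book>" +
            "</books>");
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = LoadBooks();
        string path = "/books/book/authors/author";

        // SelectSingleNode returns only the first match (or null if there is none).
        XmlNode first = doc.SelectSingleNode(path);
        Console.WriteLine(first.InnerText);

        // SelectNodes returns every match as an XmlNodeList.
        XmlNodeList all = doc.SelectNodes(path);
        Console.WriteLine(all.Count);
    }
}
```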

39
An Example
  • The following code displays the titles of books
    whose price is less than 30.00.

Dim xmlDoc As New XmlDocument()
xmlDoc.Load(filepath)
Dim n As XmlNode
For Each n In _
    xmlDoc.SelectNodes("/books/book[@price < 30]/title/text()")
    ' Display n.Value
Next
40
More on XPath
  • There are many more features and much more
    functionality available with XPath, which we'll
    not examine.
  • For a good tutorial on XPath, see
    http://www.w3schools.com/xpath/default.asp.

41
Summary
  • In this presentation, we saw how to
    programmatically iterate through XML documents.
  • We examined the differences between the push and
    pull methods. The pull method uses the DOM,
    while the push method uses XmlTextReaders and
    XmlTextWriters.

42
Summary
  • We briefly studied the usage of XPath, a
    technology designed to allow for XML document
    navigation.
  • We saw how to use the SelectSingleNode() and
    SelectNodes() methods of the XmlNode class to
    navigate an XML document.
  • XML document navigation is only possible in the
    DOM world.

43
Creating XML Documents
  • Recall that XML documents can be read using both
    the push and pull methods:
  • The pull model loads the entire XML document in
    memory before working with it (DOM).
  • The push model loads only the needed portions of
    the XML document (XmlReader classes).

44
Creating XML Documents
  • When creating XML documents, you can use either a
    push or pull methodology:
  • Pull: use the DOM.
  • Push: use the XmlTextWriter class.
  • We will examine both approaches in this section.

45
Creating XML Documents with the XmlTextWriter
Class
  • The XmlTextWriter class "represents a writer that
    provides a fast, non-cached, forward-only way of
    generating streams or files containing XML data"
    (Microsoft documentation).
  • It has methods that allow for the creation of
    elements, attributes, text content, XML comments,
    and so on.

46
The XmlTextWriter Class
  • The XmlTextWriter can output to a file or stream.
    This is specified in the class's constructor,
    which has three forms:
  • The first form accepts a TextWriter.
  • The second accepts a Stream and an Encoding.
  • The third accepts a file path and an Encoding.
    (If the file exists, it will be overwritten with
    the new content.)

47
The XmlTextWriter Class
  • Important methods include:
  • WriteStartDocument(): outputs the <?xml ?>
    XML declaration.
  • WriteEndDocument(): signals the completion of
    writing. (Closes any open elements or attributes
    and puts the writer back in its start state.)
  • Flush(): flushes the output.
  • Close(): closes the stream, flushing the output.

48
The XmlTextWriter Class
  • More important methods include:
  • WriteStartElement(string): creates the start of
    an element with a specified name.
  • WriteEndElement(): ends the element opened by
    the most recent unclosed WriteStartElement(string) call.
  • WriteString(string): writes text content for an
    element.
  • WriteAttributeString(string, string): writes an
    attribute with a specified name and value.

49
Examples
  • View the XmlTextWriter Demo.
  • Note that the XmlTextWriter has a Formatting
    property that can be set to either:
  • Formatting.None (the default)
  • Formatting.Indented
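The demo itself is not included in this transcript. As a stand-in, here is a minimal sketch that uses the methods listed on the previous slides to write a one-book document; the WriterDemo class and BuildBooksXml helper are hypothetical, and the output goes to a StringWriter (the TextWriter constructor form) rather than a file.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriterDemo
{
    public static string BuildBooksXml()
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer); // TextWriter constructor form
        writer.Formatting = Formatting.Indented;

        writer.WriteStartDocument();
        writer.WriteStartElement("books");
        writer.WriteStartElement("book");
        writer.WriteAttributeString("price", "34.95");
        writer.WriteStartElement("title");
        writer.WriteString("Animal Farm");
        writer.WriteEndElement();   // </title>
        writer.WriteEndElement();   // </book>
        writer.WriteEndElement();   // </books>
        writer.WriteEndDocument();
        writer.Close();             // flushes and closes the underlying writer

        return buffer.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildBooksXml());
    }
}
```

Because the target is a StringWriter, the declaration reports the encoding of .NET strings; writing to a file with the file path and Encoding constructor reports the encoding passed in.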

50
Creating XML Documents with the DOM
  • The DOM classes can be used to create an XML
    document.
  • To start, you need to create a new XmlDocument
    instance.
  • To create new elements, use the
    CreateElement(string) method to create a new
    element with a specified name.
  • To add an element to an existing element, use the
    AppendChild(element) method.

51
Creating XML Documents with the DOM
  • To create new attributes, use the
    CreateAttribute(string) method.
  • To add an attribute to an element e, use
    e.Attributes.Append(attribute).
  • To create text content, use the
    CreateTextNode(string) method. Text nodes,
    like elements, can be added via the
    AppendChild(node) method.

52
Creating XML Documents with the DOM
  • The content of the DOM can be saved by calling the
    Save() method, which can save the results to:
  • A specified file path
  • A TextWriter
  • A Stream
  • An XmlWriter
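The creation and saving steps above can be sketched as follows. The DomBuilder class and BuildBook helper are hypothetical, and Save() is given a StringWriter (the TextWriter overload) instead of a file path so the result can be printed.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DomBuilder
{
    public static XmlDocument BuildBook()
    {
        XmlDocument doc = new XmlDocument();

        XmlElement book = doc.CreateElement("book");
        doc.AppendChild(book);                      // becomes the root element

        XmlAttribute price = doc.CreateAttribute("price");
        price.Value = "34.95";
        book.Attributes.Append(price);

        XmlElement title = doc.CreateElement("title");
        title.AppendChild(doc.CreateTextNode("Animal Farm"));
        book.AppendChild(title);

        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = BuildBook();
        StringWriter buffer = new StringWriter();
        doc.Save(buffer);                           // the TextWriter overload of Save()
        Console.WriteLine(buffer.ToString());
    }
}
```

Note that every node is created by the XmlDocument that will own it and only becomes part of the tree once it is appended.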

53
Examples
  • View the DOMExample-VB demo. Semantically
    identical to our earlier demo.
  • Note that to apply indentation, the
    PreserveWhitespace property should be set to False
    before the Save() method is called.

54
Editing XML Documents with the DOM
  • In addition to creating new XML documents, the
    DOM can be used to edit existing XML documents.
  • To accomplish this, we perform the following
    steps:
  • Load the XML document to edit via the XmlDocument
    class's Load() method.
  • Programmatically edit the contents of the DOM.
  • Save the changes using the Save() method.

55
Example
  • For example, imagine we wanted to find all
    instances of a particular string in just the text
    content of an XML document, and replace it with
    some other string.
  • We could load the XML document into the DOM,
    access all of its text nodes via an appropriate
    XPath statement, and then perform a find and
    replace on each one as needed.
  • Finally, we could then save the changes back to
    the original XML file.

56
Example
  • For example, imagine we wanted to replace the
    word "book" with "collection of words" in the
    text content of the following XML document:

<book>
  <title>The Greatest Book</title>
  <year>1998</year>
  <authors>
    <author>Smith</author>
    <author>Book</author>
  </authors>
</book>
57
Example
  • We'd want this to become:

<book>
  <title>The Greatest Collection of Words</title>
  <year>1998</year>
  <authors>
    <author>Smith</author>
    <author>Collection of Words</author>
  </authors>
</book>
58
Example
  • Note that we would not want this XML document to
    become the following:

<collection of words>
  <title>The Greatest Collection of Words</title>
  <year>1998</year>
  <authors>
    <author>Smith</author>
    <author>Collection of Words</author>
  </authors>
</collection of words>

We want to replace only text content, not element
names as well!
59
Example
  • Run the FindAndReplace-CSharp demo.
  • Note the XPath to access all text nodes: //text()
  • Realize that we could have also recursively
    iterated through the DOM searching for text nodes
    (checking each XmlNode's NodeType property), but
    the XPath approach is much cleaner and simpler.
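The demo's source is not part of this transcript. A minimal sketch of the same idea follows; the FindAndReplace class and ReplaceInText helper are hypothetical, the match is case-sensitive, and the XML comes from an inline string rather than a file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FindAndReplace
{
    // Replaces oldText with newText in text nodes only; element and
    // attribute names are never touched.
    public static void ReplaceInText(XmlDocument doc, string oldText, string newText)
    {
        // Copy the matches first so the document is not edited while
        // the SelectNodes() result is still being enumerated.
        List<XmlNode> textNodes = new List<XmlNode>();
        foreach (XmlNode node in doc.SelectNodes("//text()"))
        {
            textNodes.Add(node);
        }

        foreach (XmlNode node in textNodes)
        {
            node.Value = node.Value.Replace(oldText, newText);
        }
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book><title>The Greatest Book</title><authors>" +
                    "<author>Smith</author><author>Book</author></authors></book>");
        ReplaceInText(doc, "Book", "Collection of Words");
        Console.WriteLine(doc.OuterXml);
    }
}
```

The root element keeps its name because //text() never selects element nodes. To persist the change, the document would then be written back with Save().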

60
Removing Nodes in a DOM
  • Elements and text nodes can be removed from the
    DOM via the RemoveChild(element) method.
  • Attributes can be removed with the
    Attributes.Remove(attribute) method.
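A short sketch of both kinds of removal. The RemoveDemo class, the RemoveYearAndPrice helper, and the sample XML are hypothetical.

```csharp
using System;
using System.Xml;

public static class RemoveDemo
{
    public static XmlDocument RemoveYearAndPrice(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlElement book = doc.DocumentElement;

        // RemoveChild is called on the parent of the node being removed.
        XmlNode year = book.SelectSingleNode("year");
        if (year != null)
        {
            book.RemoveChild(year);
        }

        // Attributes live in their own collection, so they are removed there.
        XmlAttribute price = book.Attributes["price"];
        if (price != null)
        {
            book.Attributes.Remove(price);
        }
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = RemoveYearAndPrice(
            "<book price=\"34.95\"><title>Animal Farm</title><year>1945</year></book>");
        Console.WriteLine(doc.OuterXml);
    }
}
```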

61
Summary
  • With the XmlTextWriter, you can create an XML
    document from scratch.
  • Using the DOM, you can create an XML document
    from scratch as well as edit existing documents.

62
Questions?